CatBoost: Distributed Training, Uncertainty Estimation and Other News

Update: Video Recording:

Slides: here


After our last two meetups with core developers of XGBoost and LightGBM, it is now CatBoost’s turn, with the head of the CatBoost dev team speaking! As before, we’ll fit into a one-hour slot: a 35-minute talk followed by 20 minutes of Q&A (10:00-10:55am Pacific Time).

The Zoom link will be posted in the comments below at 9:55am. Because our Zoom account has a 100-attendee limit, only the first 100 people will be able to join the call.

CatBoost: Distributed Training, Uncertainty Estimation and Other News
by Stanislav Kirillov

CatBoost is a popular open-source library for training gradient boosting models, with built-in support for categorical, text, and embedding features.
In this talk, we will discuss major updates and review the main features of CatBoost, including:
* CatBoost for Spark release
* Object embeddings and text features support
* Uncertainty estimation
* GPU training support
* Dataset prequantization support
* Fast inference (both CPU and GPU)

We will show a brief demo of CatBoost PySpark training and present plans for CatBoost development.

Speaker Bio:
Stanislav Kirillov is the head of the CatBoost development team at Yandex. He develops machine learning tools and builds and maintains the infrastructure that supports them. Stanislav is a big fan of distributed training and low-level software optimizations.

Date/Time: Tuesday, March 30, 10:00-10:55am Pacific Time
Venue: online (zoom)
RSVP: here on meetup
Note: The Zoom link will be posted in the comments on meetup (link above) at 9:55am. Because our Zoom account has a 100-attendee limit, only the first 100 people will be able to join the call.
