CatBoost: gradient boosting with categorical features support
About
In this paper we present CatBoost, a new open-source gradient boosting library that successfully handles categorical features and outperforms existing publicly available implementations of gradient boosting in terms of quality on a set of popular publicly available datasets. The library has a GPU implementation of the learning algorithm and a CPU implementation of the scoring algorithm, both of which are significantly faster than other gradient boosting libraries on ensembles of similar sizes.
Anna Veronika Dorogush, Vasily Ershov, Andrey Gulin • 2018
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Classification | Petfinder (test) | Accuracy | 38.69 | 16 |
| Regression | Crossed Barrel Dataset (80% uniform sampling) | R^2 | 0.76 | 10 |
| Regression | Cogni-e-Spin Dataset (80% uniform sampling) | R^2 | 0.55 | 10 |
| Surrogate Modeling | Crossed Barrel Dataset biased sampling | R^2 | 0.55 | 10 |
| Surrogate Modeling | Cogni-e-Spin Dataset biased sampling | R^2 | 0.36 | 10 |
| Regression | Lattice Dataset (80% uniform sampling) | R^2 | 0.95 | 10 |
| Surrogate Modeling | Lattice Dataset biased sampling | R^2 | 0.67 | 10 |
| Classification | Airbnb (test) | Accuracy | 43.56 | 8 |
| Classification | PAD-UFES-20 (PU20) (test) | Accuracy | 80.43 | 8 |
| Classification | CBIS-DDSM Calc (test) | Accuracy | 72.09 | 8 |