Free Lunch in the Forest: Functionally-Identical Pruning of Boosted Tree Ensembles

About

Tree ensembles, including boosting methods, are highly effective and widely used for tabular data. However, large ensembles lack interpretability and require longer inference times. We introduce a method to prune a tree ensemble into a reduced version that is "functionally identical" to the original model. In other words, our method guarantees that the prediction function stays unchanged for any possible input. As a consequence, this pruning algorithm is lossless for any aggregated metric. We formalize the problem of functionally identical pruning on ensembles, introduce an exact optimization model, and provide a fast yet highly effective method to prune large ensembles. Our algorithm iteratively prunes considering a finite set of points, which is incrementally augmented using an adversarial model. In multiple computational experiments, we show that our approach is a "free lunch", significantly reducing the ensemble size without altering the model's behavior. Thus, we can preserve state-of-the-art performance at a fraction of the original model's size.
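The loop described above (prune against a finite set of points, then let an adversary search for an input where the pruned and original models disagree) can be illustrated on a toy problem. The sketch below is not the authors' algorithm: it assumes one-dimensional decision stumps, keeps the original tree weights, replaces the exact optimization model with a brute-force subset search, and replaces the adversarial model with an enumeration of the cells induced by the split thresholds. All function names and the stump representation are hypothetical.

```python
from itertools import combinations

# Hypothetical toy model: a stump is (threshold, left_value, right_value, weight).
def score(stumps, x):
    """Weighted sum of stump outputs at the scalar input x."""
    return sum(w * (left if x <= t else right) for t, left, right, w in stumps)

def predict(stumps, x):
    """Binary class: 1 if the aggregated score is non-negative, else 0."""
    return 1 if score(stumps, x) >= 0 else 0

def representatives(stumps):
    """One point per cell of the partition induced by the thresholds.

    With splits of the form x <= t, the cells are (-inf, t1], (t1, t2], ...,
    (tk, +inf), so each threshold plus one point beyond the largest suffices.
    """
    ts = sorted({t for t, _, _, _ in stumps})
    return ts + [ts[-1] + 1.0]

def counterexample(full, pruned):
    """Toy adversary: return an input where the two models disagree, or None."""
    for x in representatives(full):
        if predict(full, x) != predict(pruned, x):
            return x
    return None

def smallest_consistent_subset(full, points):
    """Brute force: smallest subset that matches the full model on `points`."""
    for size in range(1, len(full) + 1):
        for subset in combinations(full, size):
            if all(predict(subset, x) == predict(full, x) for x in points):
                return list(subset)
    return list(full)  # not reached: the full ensemble always matches itself

def prune(full, points=()):
    """Alternate pruning on a finite point set and adversarial augmentation."""
    points = list(points)
    while True:
        candidate = smallest_consistent_subset(full, points)
        x = counterexample(full, candidate)
        if x is None:
            return candidate  # no disagreement anywhere on the real line
        points.append(x)      # grow the point set and prune again

# Four stumps whose aggregated prediction is 1 exactly when x > 0.
full = [(5.0, 1, -1, 0.1), (2.0, -1, 1, 0.2), (0.0, -1, 1, 1.0), (0.0, -1, 1, 0.5)]
pruned = prune(full)
print(len(full), "->", len(pruned), pruned)
```

In this toy run the point set starts empty, so the first candidates are rejected by the adversary and the set grows until a single stump reproduces the full ensemble's prediction on every cell. The brute-force search is exponential in the number of trees and is only meant to make the control flow concrete.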

Youssouf Emine, Alexandre Forel, Idriss Malek, Thibaut Vidal • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Pruning Boosted Tree Ensembles | Adult | Pruning Rate: 21.3 | 7 |
| Pruning Boosted Tree Ensembles | Balance Scale | Pruning Rate: 71.3 | 7 |
| Pruning Boosted Tree Ensembles | Breast Cancer Wisconsin | Pruning Rate: 26 | 7 |
| Pruning Boosted Tree Ensembles | COMPAS-ProPublica | Pruning Rate: 58 | 7 |
| Pruning Boosted Tree Ensembles | elec2 | Pruning Rate: 30 | 7 |
| Pruning Boosted Tree Ensembles | FICO | Pruning Rate: 26 | 7 |
| Pruning Boosted Tree Ensembles | HTRU2 | Pruning Rate: 62 | 7 |
| Pruning Boosted Tree Ensembles | JM1 | Pruning Rate: 6 | 7 |
| Pruning Boosted Tree Ensembles | Pima Diabetes | Pruning Rate: 17.3 | 7 |
| Pruning Boosted Tree Ensembles | POL | Pruning Rate: 52 | 7 |
Showing 10 of 23 rows
