
PINE: Pruning Boosted Tree Ensembles with Conformal In-Distribution Prediction Equivalence

About

Tree ensembles are machine learning models with strong predictive performance and interpretability, and remain widely used for tabular data. Standard pruning methods for tree ensembles typically optimize an accuracy-compression trade-off and may change a subset of predictions, potentially compromising decision consistency. Faithful pruning methods address this issue by preserving prediction equivalence over the entire input space, but this requirement leads to lower compression ratios. We propose PINE, a pruning method that provides strong guarantees within an in-distribution region. PINE preserves prediction equivalence within this region and controls the region size using a single parameter $\alpha$ via conformal calibration. Experiments on 12 public tabular datasets show that PINE improves the compression ratio by up to 30% while preserving predictions at a comparable level to existing faithful pruning methods.
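As a rough illustration of how a single parameter $\alpha$ can control the size of an in-distribution region through conformal calibration, the sketch below uses standard split conformal calibration. It is not the authors' implementation: the nonconformity score (Euclidean distance to the training mean), the synthetic data, and the names `conformal_threshold` and `nonconformity` are all assumptions made for illustration.

```python
import math
import numpy as np

def conformal_threshold(cal_scores, alpha):
    """Split-conformal threshold: the ceil((n+1)(1-alpha))-th smallest score.

    Under exchangeability, a fresh in-distribution point has a score at or
    below this threshold with probability at least 1 - alpha.
    """
    scores = np.sort(np.asarray(cal_scores, dtype=float))
    n = scores.size
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: the region is unbounded.
        return math.inf
    return float(scores[k - 1])

def nonconformity(X, center):
    # Hypothetical score (an assumption, not taken from the paper):
    # Euclidean distance to the mean of the training data.
    return np.linalg.norm(X - center, axis=1)

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 4))
X_cal = rng.normal(size=(200, 4))
center = X_train.mean(axis=0)

cal_scores = nonconformity(X_cal, center)
q = conformal_threshold(cal_scores, alpha=0.1)

# Points with score <= q are treated as in-distribution; a pruning method
# would only need to preserve predictions on such points.
X_new = rng.normal(size=(5, 4))
in_region = nonconformity(X_new, center) <= q
```

A smaller $\alpha$ raises the threshold and enlarges the region, so more inputs are covered by the equivalence requirement and less pruning is possible; a larger $\alpha$ does the opposite.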

Haruki Yajima, Yusuke Matsui • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Ensemble Pruning Fidelity | 12 datasets mean performance (test) | Fidelity: 99.57 | 19 |
| Pruning Boosted Tree Ensembles | Adult | Pruning Rate: 47.3 | 7 |
| Pruning Boosted Tree Ensembles | Balance Scale | Pruning Rate: 74.7 | 7 |
| Pruning Boosted Tree Ensembles | Breast Cancer Wisconsin | Pruning Rate: 88 | 7 |
| Pruning Boosted Tree Ensembles | COMPAS-ProPublica | Pruning Rate: 82.7 | 7 |
| Pruning Boosted Tree Ensembles | elec2 | Pruning Rate: 49.3 | 7 |
| Pruning Boosted Tree Ensembles | FICO | Pruning Rate: 60 | 7 |
| Pruning Boosted Tree Ensembles | HTRU2 | Pruning Rate: 81.3 | 7 |
| Pruning Boosted Tree Ensembles | JM1 | Pruning Rate: 56 | 7 |
| Pruning Boosted Tree Ensembles | Pima Diabetes | Pruning Rate: 55.3 | 7 |
Showing 10 of 13 rows
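
The page does not define the two metrics reported above. A minimal sketch follows, assuming that fidelity is the percentage of test inputs on which the pruned ensemble agrees with the original ensemble, and that pruning rate is the percentage of trees removed. Both definitions, and the function names, are assumptions for illustration; the paper may measure them differently (for example, over nodes rather than trees).

```python
import numpy as np

def fidelity(orig_pred, pruned_pred):
    """Percentage of inputs where the pruned model matches the original.

    Assumed definition; agreement is with the original model's output,
    not with the ground-truth labels.
    """
    orig_pred = np.asarray(orig_pred)
    pruned_pred = np.asarray(pruned_pred)
    return 100.0 * float(np.mean(orig_pred == pruned_pred))

def pruning_rate(n_original, n_kept):
    """Percentage of trees removed from the ensemble (assumed definition)."""
    return 100.0 * (n_original - n_kept) / n_original

# Toy values, not results from the paper.
print(fidelity([0, 1, 1, 0], [0, 1, 0, 0]))  # 3 of 4 predictions agree
print(pruning_rate(100, 40))                 # 60 of 100 trees removed
```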
