Data-Aware and Scalable Sensitivity Analysis for Decision Tree Ensembles

About

Decision tree ensembles are widely used in critical domains, making robustness and sensitivity analysis essential to their trustworthiness. We study the feature sensitivity problem, which asks whether an ensemble is sensitive to a specified subset of features, such as protected attributes, whose manipulation can alter model predictions. Existing approaches often yield examples of sensitivity that lie far from the training distribution, limiting their interpretability and practical value. We propose a data-aware sensitivity framework that constrains the sensitive examples to remain close to the dataset, thereby producing realistic and interpretable evidence of model weaknesses. To this end, we develop novel techniques for data-aware search using a combination of mixed-integer linear programming (MILP) and satisfiability modulo theories (SMT) encodings. Our contributions are fourfold. First, we strengthen the NP-hardness result for sensitivity verification, showing that it holds even for trees of depth 1. Second, we develop MILP optimizations that significantly speed up sensitivity verification for single ensembles and that, for the first time, also handle multiclass tree ensembles. Third, we introduce a data-aware framework that generates realistic examples close to the training distribution. Finally, we conduct an extensive experimental evaluation on large tree ensembles, demonstrating scalability to ensembles with up to 800 trees of depth 8 and achieving substantial improvements over the state of the art. This framework provides a practical foundation for analyzing the reliability and fairness of tree-based models in high-stakes applications.
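
To make the feature sensitivity problem concrete, here is a minimal brute-force sketch in Python. It is not the MILP/SMT method from the paper. The toy ensemble of depth-1 trees (`STUMPS`), the sign-based classifier, the grid search, and the L-infinity closeness bound `max_dist` are all assumptions made for illustration. The sketch looks for two inputs that differ only on the chosen features yet receive different predictions, while requiring the first input to stay within `max_dist` of some dataset point.

```python
from itertools import product

# Hypothetical toy ensemble of depth-1 trees (stumps):
# (feature_index, threshold, value_if_leq, value_if_greater)
STUMPS = [
    (0, 0.5, -1.0, 1.0),
    (1, 0.5, -0.8, 0.8),
    (2, 0.5, -0.5, 0.5),
]


def predict(x):
    """Sum the stump outputs and classify by the sign of the total score."""
    score = sum(lo if x[f] <= t else hi for f, t, lo, hi in STUMPS)
    return 1 if score > 0 else 0


def find_sensitive_pair(data, sensitive, grid, max_dist):
    """Return (x, x2) that differ only on `sensitive` features and are
    classified differently, with x within L-infinity distance `max_dist`
    of some point in `data`; return None if the grid contains no such pair."""
    n_features = len(data[0])
    for x in product(grid, repeat=n_features):
        # Data-aware filter (assumed form): x must be close to the dataset.
        nearest = min(max(abs(a - b) for a, b in zip(x, d)) for d in data)
        if nearest > max_dist:
            continue
        # Try every assignment of grid values to the sensitive features.
        for values in product(grid, repeat=len(sensitive)):
            x2 = list(x)
            for index, value in zip(sensitive, values):
                x2[index] = value
            if predict(x) != predict(x2):
                return x, tuple(x2)
    return None


data = [(0.0, 0.0, 0.0)]
grid = (0.0, 1.0)
# No witness exists exactly at the data point...
print(find_sensitive_pair(data, [2], grid, max_dist=0.0))
# ...but one appears once the search may move away from the data.
print(find_sensitive_pair(data, [2], grid, max_dist=1.0))
```

In this toy setting, tightening `max_dist` removes the only witness for feature 2, which is the kind of distinction a data-aware constraint is meant to expose. Exhaustive search like this grows exponentially with the number of features, so it only serves to illustrate the problem statement.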

Namrita Varshney, Ashutosh Gupta, Arhaan Ahmad, Tanay V. Tayal, S. Akshay • 2026

Related benchmarks

Task	Dataset	Result
Multi-Class Formal Verification	covtype robust	PAR2 Runtime: 139.1	2
Multi-Class Formal Verification	covtype unrobust	PAR2 Runtime: 213.8	2
Multi-Class Formal Verification	fashion robust	PAR2 Runtime: 118.8	2
Multi-Class Formal Verification	fashion unrobust	PAR2 Runtime: 67.63	2
Multi-Class Formal Verification	MNIST ori robust	PAR2 Runtime: 108.8	2
Multi-Class Formal Verification	MNIST ori unrobust	PAR2 Runtime: 76	2
Multi-Class Formal Verification	Iris	PAR2 Runtime: 0.01	2
Multi-Class Formal Verification	Red-Wine	PAR2 Runtime: 3.83	2
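
The table reports PAR2 runtimes without defining the metric. In solver evaluations, PAR2 conventionally means the average runtime with every unsolved or timed-out instance counted as twice the time limit. The sketch below assumes that convention. The function name `par2`, the sample runtimes, and the 100-unit time limit are hypothetical and do not come from the paper's experiments.

```python
def par2(runtimes, timeout):
    """Penalized average runtime with factor 2.

    `runtimes` holds one entry per instance: the measured runtime, or None
    when the run did not finish. Runs that are unsolved or that reach the
    time limit are counted as 2 * timeout.
    """
    penalized = [
        2 * timeout if (t is None or t >= timeout) else t
        for t in runtimes
    ]
    return sum(penalized) / len(penalized)


# Two solved instances and one timeout under a hypothetical limit of 100.
score = par2([10.0, 20.0, None], timeout=100.0)
print(round(score, 2))
```

Lower PAR2 is better, and a single timeout can dominate the average, so the score reflects both speed and the number of instances solved.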
