Evasion and Hardening of Tree Ensemble Classifiers
About
Classifier evasion consists in finding for a given instance $x$ the nearest instance $x'$ such that the classifier predictions of $x$ and $x'$ are different. We present two novel algorithms for systematically computing evasions for tree ensembles such as boosted trees and random forests. Our first algorithm uses a Mixed Integer Linear Program solver and finds the optimal evading instance under an expressive set of constraints. Our second algorithm trades off optimality for speed by using symbolic prediction, a novel algorithm for fast finite differences on tree ensembles. On a digit recognition task, we demonstrate that both gradient boosted trees and random forests are extremely susceptible to evasions. Finally, we harden a boosted tree model without loss of predictive accuracy by augmenting the training set of each boosting round with evading instances, a technique we call adversarial boosting.
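To make the evasion problem concrete, here is a minimal sketch on a toy ensemble. It is not the paper's MILP formulation or its symbolic prediction algorithm. It uses plain exhaustive search, relying on the fact that a tree ensemble's prediction is piecewise constant between split thresholds, so only values at or just past a threshold need to be tried. The stump ensemble `STUMPS`, the step `eps`, the choice of $L_1$ distance, and all function names are assumptions made for illustration.

```python
from itertools import product

# Hypothetical toy ensemble (an assumption, not a model from the paper).
# Each tree is a depth-1 stump:
# (feature index, threshold, value if x[f] <= threshold, value otherwise).
STUMPS = [
    (0, 0.5, -1.0, 1.0),
    (1, 0.3, -0.5, 0.5),
    (0, 0.8, -0.2, 0.7),
]


def score(x, stumps=STUMPS):
    """Sum the leaf values selected by x across all stumps."""
    return sum(left if x[f] <= t else right for f, t, left, right in stumps)


def predict(x, stumps=STUMPS):
    """Binary prediction: class 1 if the ensemble score is positive."""
    return 1 if score(x, stumps) > 0 else 0


def candidate_values(x, stumps, feature, eps=1e-6):
    """Values worth trying for one feature: keep x[feature] unchanged,
    or move to either side of each threshold on that feature."""
    values = {x[feature]}
    for f, t, _, _ in stumps:
        if f == feature:
            values.add(t)        # falls on the "<=" side of the split
            values.add(t + eps)  # falls just on the ">" side
    return sorted(values)


def nearest_evasion(x, stumps=STUMPS):
    """Exhaustively search all threshold cells for the closest (L1)
    instance whose prediction differs from that of x.
    The result is optimal up to the step eps; the cost is exponential
    in the number of features, so this only works on toy models."""
    original = predict(x, stumps)
    grids = [candidate_values(x, stumps, f) for f in range(len(x))]
    best, best_dist = None, float("inf")
    for cand in product(*grids):
        if predict(cand, stumps) != original:
            dist = sum(abs(a - b) for a, b in zip(x, cand))
            if dist < best_dist:
                best, best_dist = cand, dist
    return best, best_dist


if __name__ == "__main__":
    x = (0.2, 0.2)
    x_adv, dist = nearest_evasion(x)
    print("original:", x, "->", predict(x))
    print("evasion: ", x_adv, "->", predict(x_adv), "L1 distance:", dist)
```

The exponential cost of this enumeration is exactly what motivates a solver-based or approximate approach on realistic ensembles.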
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Multi-Class Formal Verification | covtype robust | PAR2 Runtime | 3.09e+3 | 2 |
| Multi-Class Formal Verification | covtype unrobust | PAR2 Runtime | 3.09e+3 | 2 |
| Multi-Class Formal Verification | fashion robust | PAR2 Runtime | 5.67e+3 | 2 |
| Multi-Class Formal Verification | fashion unrobust | PAR2 Runtime | 5.00e+3 | 2 |
| Multi-Class Formal Verification | MNIST ori robust | PAR2 Runtime | 3.34e+3 | 2 |
| Multi-Class Formal Verification | MNIST ori unrobust | PAR2 Runtime | 3.59e+3 | 2 |
| Multi-Class Formal Verification | Iris | PAR2 Runtime | 0.01 | 2 |
| Multi-Class Formal Verification | Red-Wine | PAR2 Runtime | 3.89 | 2 |