Decision Trees for Decision-Making under the Predict-then-Optimize Framework
About
We consider the use of decision trees for decision-making problems under the predict-then-optimize framework. That is, we would like to first use a decision tree to predict unknown input parameters of an optimization problem, and then make decisions by solving the optimization problem using the predicted parameters. A natural loss function in this framework measures the suboptimality of the decisions induced by the predicted input parameters, rather than the prediction error of the input parameters themselves. This loss function is known in the literature as the Smart Predict-then-Optimize (SPO) loss, and we propose a tractable methodology called SPO Trees (SPOTs) for training decision trees under this loss. SPOTs benefit from the interpretability of decision trees, providing an interpretable segmentation of contextual features into groups with distinct optimal solutions to the optimization problem of interest. We conduct several numerical experiments on synthetic and real data, including predicting travel times for shortest path problems and predicting click probabilities for news article recommendation. We demonstrate on these datasets that SPOTs simultaneously provide higher quality decisions and significantly lower model complexity than other machine learning approaches (e.g., CART) trained to minimize prediction error.
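To make the distinction between decision loss and prediction error concrete, the following is a minimal sketch, not the authors' implementation. It assumes a linear cost objective over a small, explicitly enumerated feasible set; the names `solve`, `spo_loss`, `paths`, and the toy cost vectors are hypothetical and chosen only for illustration. Ties between equally good decisions are broken by taking the first minimizer, which is a simplification.

```python
import numpy as np

def solve(cost, feasible):
    """Return the feasible decision with minimum linear cost (brute force)."""
    values = feasible @ cost
    return feasible[np.argmin(values)]

def spo_loss(pred_cost, true_cost, feasible):
    """Excess true cost incurred by acting on pred_cost instead of true_cost."""
    decision = solve(pred_cost, feasible)   # decision induced by the prediction
    best = solve(true_cost, feasible)       # decision with perfect information
    return float(true_cost @ decision - true_cost @ best)

def mse(pred_cost, true_cost):
    """Ordinary mean squared prediction error, for comparison."""
    return float(np.mean((pred_cost - true_cost) ** 2))

# Toy shortest-path instance: two candidate paths over three edges,
# encoded as edge-incidence vectors (illustrative data, not from the paper).
paths = np.array([[1, 1, 0],    # path A uses edges 0 and 1
                  [0, 0, 1]])   # path B uses edge 2
true_cost = np.array([1.0, 1.0, 3.0])

pred_far = np.array([2.0, 2.0, 5.0])    # large errors, but path A still looks cheapest
pred_near = np.array([1.0, 1.0, 1.5])   # one moderate error that flips the decision to path B

for name, pred in [("pred_far", pred_far), ("pred_near", pred_near)]:
    print(name, "MSE:", mse(pred, true_cost),
          "SPO loss:", spo_loss(pred, true_cost, paths))
```

In this toy instance, `pred_near` has the lower prediction error yet leads to the worse decision, while `pred_far` has the higher prediction error but still selects the optimal path. This is the kind of mismatch that motivates training under a decision-based loss rather than prediction error.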
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Minimum Cost Vertex Cover | PDH Artificial (test) | Mean Regret: 55.07 | 20 |
| Minimum Cost Vertex Cover | POLSKA Real-life (test) | Mean Regret: 2.57 | 20 |
| Minimum Cost Vertex Cover | POLSKA Artificial (test) | Mean Regret: 117.3 | 20 |
| Minimum Cost Vertex Cover | PDH Real-life (test) | Mean Regret: 11.19 | 20 |
| Minimum Cost Flow Problem (MCFP) | USANet Artificial Size 100 | Mean Regret: 1.74e+3 | 10 |
| Minimum Cost Flow Problem (MCFP) | GÉANT Artificial Size 100 | Mean Regret: 745.4 | 10 |
| Minimum Cost Flow Problem (MCFP) | USANet Artificial Size 300 | Mean Regret: 1.72e+3 | 10 |
| Minimum Cost Flow Problem (MCFP) | GÉANT Artificial Size 300 | Mean Regret: 747.6 | 10 |
| Minimum Cost Flow Problem (MCFP) | USANet Real-life Size 100 | Mean Regret: 178.3 | 10 |
| Minimum Cost Flow Problem (MCFP) | USANet Real-life Size 300 | Mean Regret: 145.5 | 10 |