Output-Constrained Decision Trees
About
Incorporating domain-specific constraints into machine learning models is essential for generating predictions that are both accurate and feasible in real-world applications. This paper introduces new methods for training Output-Constrained Regression Trees (OCRT), addressing the limitations of traditional decision trees in constrained multi-target regression tasks. We propose three approaches: M-OCRT, which uses split-based mixed integer programming to enforce constraints; E-OCRT, which employs an exhaustive search for optimal splits and solves constrained prediction problems at each decision node; and EP-OCRT, which applies post-hoc constrained optimization to tree predictions. To illustrate their potential use in ensemble learning, we also introduce a random forest framework that operates under convex feasible sets. We validate the proposed methods through a computational study on both synthetic and industry-driven hierarchical time series datasets. Our results demonstrate that imposing constraints during decision tree training yields predictions that are both accurate and feasible.
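To make the idea of post-hoc constrained optimization more concrete, the following is a minimal sketch, not the paper's formulation. It assumes the feasible set is an affine set {y : Ay = b} with A of full row rank, and that the correction is a Euclidean projection of an unconstrained prediction onto that set. The function name `project_onto_affine_set`, the matrix `A`, and the three-target hierarchy (a total equal to the sum of two children) are hypothetical choices made for illustration only.

```python
import numpy as np


def project_onto_affine_set(y_hat, A, b):
    """Euclidean projection of y_hat onto {y : A @ y = b}.

    Assumption: A has full row rank, so A @ A.T is invertible.
    """
    residual = A @ y_hat - b
    # Closed-form solution of: min ||y - y_hat||^2  s.t.  A @ y = b
    correction = A.T @ np.linalg.solve(A @ A.T, residual)
    return y_hat - correction


# Hypothetical hierarchy: target 0 (total) must equal target 1 + target 2,
# i.e. y0 - y1 - y2 = 0.
A = np.array([[1.0, -1.0, -1.0]])
b = np.array([0.0])

# An unconstrained multi-target prediction that violates the constraint
# (10 != 4 + 5).
raw_prediction = np.array([10.0, 4.0, 5.0])

feasible_prediction = project_onto_affine_set(raw_prediction, A, b)
print(feasible_prediction)      # adjusted prediction
print(A @ feasible_prediction)  # constraint residual, numerically zero
```

The projection spreads the violation evenly across the three targets because every coefficient of the constraint has the same magnitude. For general convex feasible sets that are not affine, there is no such closed form, and a convex optimization solver would be needed in place of the linear solve.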
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Multi-Target Regression | Synthetic Datasets | Average MSE Gap (%) | 55.1 | 5 |
| End-to-End Learning | Synthetic Datasets | Delta r (Δr) | -26 | 4 |
| Hierarchical Time Series Forecasting | HTS Noise-free | Δ | 45.46 | 4 |
| Hierarchical Time Series Forecasting | HTS Noisy | Delta (Δ) | 36.2 | 4 |