Output-Constrained Decision Trees

About

Incorporating domain-specific constraints into machine learning models is essential for generating predictions that are both accurate and feasible in real-world applications. This paper introduces new methods for training Output-Constrained Regression Trees (OCRT), addressing the limitations of traditional decision trees in constrained multi-target regression tasks. We propose three approaches: M-OCRT, which uses split-based mixed integer programming to enforce constraints; E-OCRT, which employs an exhaustive search for optimal splits and solves constrained prediction problems at each decision node; and EP-OCRT, which applies post-hoc constrained optimization to tree predictions. To illustrate their potential uses in ensemble learning, we also introduce a random forest framework working under convex feasible sets. We validate the proposed methods through a computational study both on synthetic and industry-driven hierarchical time series datasets. Our results demonstrate that imposing constraints on decision tree training results in accurate and feasible predictions.

H\"useyin Tun\c{c}, Do\u{g}anay \"Ozese, \c{S}. \.Ilker Birbil, Donato Maragno, Marco Caserta, Mustafa Baydo\u{g}an• 2024

Related benchmarks

Task	Dataset	Result
Multi-Target Regression	Synthetic Datasets	Average MSE Gap (%)55.1	5
End-to-End Learning	Synthetic Datasets	Delta r (Δr)-26	4
Hierarchical Time Series Forecasting	HTS Noise-free	Δ45.46	4
Hierarchical Time Series Forecasting	HTS Noisy	Delta (Δ)36.2	4

Showing 4 of 4 rows

Other info

Follow for update

@wizwand_team Discord