Talking Trees: Reasoning-Assisted Induction of Decision Trees for Tabular Data
About
Tabular foundation models are becoming increasingly popular for low-resource tabular problems. These models compensate for small training datasets by pretraining on large volumes of synthetic data. The prior knowledge obtained via pretraining provides exceptional performance, but the resulting model is a black box that is difficult to interpret and costly at inference time. In this work, we explore an alternative strategy: using reasoning-capable LLMs to induce decision trees for small tabular datasets in an agentic setup. We design a minimal set of tools for constructing, analyzing, and manipulating decision trees. Equipped with these tools, the LLM combines its prior knowledge with learning from data to produce a lightweight decision tree that outperforms CART and recent non-greedy tree learners and remains competitive with tree ensembles on low-resource tabular problems. Beyond being competitive with state-of-the-art black-box models, a single agentic decision tree comes with a human-readable reasoning trace that can be checked for biases and data leaks. Furthermore, because the tree is built through LLM reasoning, additional human input can be incorporated into the tree even when that input is not captured in the data.
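To make the idea of a tool set for constructing, analyzing, and manipulating decision trees more concrete, here is a minimal Python sketch. It is not the authors' tool set, which the abstract does not specify: the names `Node`, `split_leaf`, `prune`, and `accuracy` are hypothetical, and the four-row dataset is made up for illustration. In an agentic setup, an LLM would presumably call tools of this kind, inspect the result, and decide on the next edit.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Node:
    # A node is a leaf while `feature` is None; otherwise rows with
    # x[feature] <= threshold go to `left` and the rest go to `right`.
    prediction: float
    feature: Optional[int] = None
    threshold: Optional[float] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def predict(node: Node, x: Sequence[float]) -> float:
    """Route one row down the tree and return the leaf's prediction."""
    while node.feature is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.prediction


def split_leaf(leaf: Node, X, y, rows: List[int],
               feature: int, threshold: float) -> Tuple[List[int], List[int]]:
    """Construct: turn a leaf into an internal node with two new leaves.

    Each child predicts the mean label of the training rows it receives.
    Returns the row indices routed to the left and right child.
    """
    left_rows = [i for i in rows if X[i][feature] <= threshold]
    right_rows = [i for i in rows if X[i][feature] > threshold]
    if not left_rows or not right_rows:
        raise ValueError("split would leave one child without rows")
    leaf.feature, leaf.threshold = feature, threshold
    leaf.left = Node(sum(y[i] for i in left_rows) / len(left_rows))
    leaf.right = Node(sum(y[i] for i in right_rows) / len(right_rows))
    return left_rows, right_rows


def prune(node: Node) -> None:
    """Manipulate: collapse an internal node back into a leaf."""
    node.feature = node.threshold = node.left = node.right = None


def accuracy(root: Node, X, y) -> float:
    """Analyze: fraction of rows whose thresholded prediction matches the label."""
    hits = sum((predict(root, x) >= 0.5) == bool(t) for x, t in zip(X, y))
    return hits / len(y)


# Toy data (made up): feature 0 separates the two classes, feature 1 is noise.
X = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0]]
y = [0, 0, 1, 1]

root = Node(prediction=sum(y) / len(y))      # start from a single leaf
before = accuracy(root, X, y)
split_leaf(root, X, y, list(range(len(y))), feature=0, threshold=2.5)
after = accuracy(root, X, y)
print(before, after)
```

Unlike greedy CART, nothing here chooses the split automatically: the caller supplies the feature and threshold, which is where prior knowledge or human input could enter.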
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Binary Classification | Fitness TabArena v0.1 (test) | ROC AUC | 0.828 | 10 |
| Binary Classification | CreditG TabArena v0.1 (test) | ROC AUC | 0.792 | 10 |
| Binary Classification | TabArena Customer v0.1 (test) | ROC AUC | 0.738 | 10 |
| Binary Classification | QSARBio TabArena v0.1 (test) | ROC AUC | 93.7 | 10 |
| Binary Classification | Hazelnut TabArena v0.1 (test) | ROC AUC | 99 | 10 |
| Multiclass Classification | Anneal TabArena v0.1 (test) | LogLoss | 0.014 | 10 |
| Multiclass Classification | Phishing TabArena v0.1 (test) | LogLoss | 0.218 | 10 |
| Regression | Airfoil TabArena v0.1 (test) | RMSE | 1.029 | 10 |
| Regression | Insurance TabArena v0.1 (test) | RMSE | 4.44e+3 | 10 |
| Regression | QSARFish TabArena v0.1 (test) | RMSE | 0.849 | 10 |