DOFEN: Deep Oblivious Forest ENsemble

About

Deep Neural Networks (DNNs) have revolutionized artificial intelligence, achieving impressive results on diverse data types, including images, videos, and texts. However, DNNs still lag behind Gradient Boosting Decision Trees (GBDT) on tabular data, a format extensively utilized across various domains. In this paper, we propose DOFEN, short for Deep Oblivious Forest ENsemble, a novel DNN architecture inspired by oblivious decision trees. DOFEN constructs relaxed oblivious decision trees (rODTs) by randomly combining conditions for each column and further enhances performance with a two-level rODT forest ensembling process. With this approach, DOFEN achieves state-of-the-art results among DNNs and further narrows the gap between DNNs and tree-based models on the well-recognized Tabular Benchmark (Grinsztajn et al., 2022), which includes 73 datasets in total, spanning a wide array of domains. The code of DOFEN is available at: https://github.com/Sinopac-Digital-Technology-Division/DOFEN.

Kuan-Yu Chen, Ping-Han Chiang, Hsin-Rung Chou, Chih-Sheng Chen, Tien-Hao Chang • 2024
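
The abstract describes DOFEN as inspired by oblivious decision trees. As a rough illustration of that underlying idea only, the sketch below shows a classic (hard, non-differentiable) oblivious decision tree, in which every node at a given depth tests the same condition, so the outcomes of d conditions form a d-bit code that selects one of 2^d leaves. This is not DOFEN's relaxed rODT or its forest ensembling; the function names, conditions, and leaf values are hypothetical and chosen purely for the example.

```python
from typing import Sequence, Tuple

# A condition is (column index, threshold), read as "x[column] > threshold".
Condition = Tuple[int, float]


def odt_leaf_index(x: Sequence[float], conditions: Sequence[Condition]) -> int:
    """Return the leaf reached by x in a classic oblivious decision tree.

    All nodes at depth k share conditions[k], so the d test outcomes
    form a d-bit binary code that addresses one of 2**d leaves.
    """
    index = 0
    for column, threshold in conditions:
        bit = 1 if x[column] > threshold else 0
        index = (index << 1) | bit  # append this depth's outcome to the code
    return index


def odt_predict(
    x: Sequence[float],
    conditions: Sequence[Condition],
    leaf_values: Sequence[float],
) -> float:
    """Look up the value stored at the leaf that x falls into."""
    if len(leaf_values) != 2 ** len(conditions):
        raise ValueError("an ODT of depth d needs exactly 2**d leaf values")
    return leaf_values[odt_leaf_index(x, conditions)]


# Hypothetical depth-2 tree over a 3-column row.
conditions = [(0, 0.5), (2, 1.0)]
leaf_values = [10.0, 20.0, 30.0, 40.0]
sample = [0.9, -3.0, 0.2]

# Column 0 passes its test (bit 1), column 2 fails (bit 0): code 0b10, leaf 2.
print(odt_predict(sample, conditions, leaf_values))  # prints 30.0
```

Because the tree is fully determined by a short list of per-column conditions, choosing a different subset of conditions yields a different tree, which is the property the abstract's "randomly combining conditions" refers to at a high level.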

Related benchmarks

Task	Dataset	Result
Classification	HE	Accuracy 38.58	66
Regression	CA Housing	RMSE 0.4584	54
Tabular Classification	NUM (L) (test)	Macro F1 0.956	18
Classification	medium-sized classification datasets	Accuracy 78.05	14
Tabular Classification	WIL M (test)	Macro F1 Score 0.939	13
Tabular Classification	WDB S (test)	Macro F1 Score 0.97	13
Tabular Classification	MAD M (test)	Macro F1 Score 0.58	13
Binary Classification	Jannis JA	Accuracy 73.32	9
Binary Classification	Higgs (HI)	Accuracy 73.11	9
Regression	YearPredictionMSD	RMSE 8.7572	9

Showing 10 of 32 rows
