Trompt: Towards a Better Deep Neural Network for Tabular Data

About

Tabular data is arguably one of the most commonly used data structures in various practical domains, including finance, healthcare and e-commerce. The inherent heterogeneity allows tabular data to store rich information. However, based on a recently published tabular benchmark, we can see deep neural networks still fall behind tree-based models on tabular datasets. In this paper, we propose Trompt (short for Tabular Prompt), a novel architecture inspired by prompt learning of language models. The essence of prompt learning is to adjust a large pre-trained model through a set of prompts outside the model without directly modifying the model. Based on this idea, Trompt separates the learning strategy of tabular data into two parts. The first part, analogous to pre-trained models, focuses on learning the intrinsic information of a table. The second part, analogous to prompts, focuses on learning the variations among samples. Trompt is evaluated with the benchmark mentioned above. The experimental results demonstrate that Trompt outperforms state-of-the-art deep neural networks and is comparable to tree-based models.

Kuan-Yu Chen, Ping-Han Chiang, Hsin-Rung Chou, Ting-Wei Chen, Tien-Hao Chang • 2023

Related benchmarks

Task	Dataset	Result
Classification	Lung	ACC 61.55	96
Classification	Adult	Accuracy 81.3	86
Classification	TOX_171	Accuracy 66.8	78
Classification	Colon	Accuracy 58.7	78
Classification	GLI_85	Accuracy 43.96	78
Classification	SMK_CAN_187	Accuracy 46.65	72
Classification	ALLAML	Accuracy 46.12	72
Classification	HE	Accuracy 36.65	66
Classification	HDLSS Datasets Summary	Average Rank 34.12	66
Classification	Prostate_GE	Accuracy 69.08	64

Showing 10 of 55 rows
