
XNNTab -- Interpretable Neural Networks for Tabular Data using Sparse Autoencoders

About

In data-driven applications relying on tabular data, where interpretability is key, machine learning models such as decision trees and linear regression are commonly applied. Although neural networks can provide higher predictive performance, they are often avoided because of their black-box nature. In this work, we present XNNTab, a neural architecture that combines the expressiveness of neural networks with interpretability. XNNTab first learns highly non-linear feature representations, which are decomposed into monosemantic features using a sparse autoencoder (SAE). These features are then assigned human-interpretable concepts, making the overall model prediction intrinsically interpretable. XNNTab outperforms interpretable predictive models and achieves performance comparable to its non-interpretable counterparts.
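The core mechanism described above — decomposing a learned dense representation into sparse, candidate-monosemantic features — can be illustrated with a minimal sparse autoencoder. The sketch below is not the authors' implementation; it assumes a standard SAE design (overcomplete ReLU encoder, linear decoder, L1 sparsity penalty) and trains it with hand-derived gradients on synthetic data whose ground truth is a small set of sparse directions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "representations": 256 vectors built as sparse combinations
# of 4 hidden ground-truth directions (stand-ins for monosemantic features).
D_true = rng.normal(size=(4, 8))
codes = rng.random((256, 4)) * (rng.random((256, 4)) < 0.3)
X = codes @ D_true

d, m = X.shape[1], 16                 # input dim, overcomplete dictionary size
We = 0.1 * rng.normal(size=(d, m))    # encoder weights
be = np.zeros(m)
Wd = 0.1 * rng.normal(size=(m, d))    # decoder weights
bd = np.zeros(d)
lam, lr = 1e-3, 1e-2                  # L1 strength, learning rate

def forward(X):
    H = np.maximum(X @ We + be, 0.0)  # ReLU activations: candidate features
    return H, H @ Wd + bd             # sparse codes, reconstruction

losses = []
n = X.shape[0]
for step in range(2000):
    H, Xhat = forward(X)
    err = Xhat - X
    # Reconstruction error plus L1 sparsity penalty on the activations.
    loss = (err ** 2).mean() + lam * np.abs(H).mean()
    losses.append(loss)
    # Manual backprop: decoder, then ReLU gate, then encoder.
    dXhat = 2 * err / (n * d)
    gWd, gbd = H.T @ dXhat, dXhat.sum(0)
    dH = (dXhat @ Wd.T + lam * np.sign(H) / (n * m)) * (H > 0)
    gWe, gbe = X.T @ dH, dH.sum(0)
    We -= lr * gWe; be -= lr * gbe
    Wd -= lr * gWd; bd -= lr * gbd

H, _ = forward(X)
sparsity = (H > 1e-6).mean()          # fraction of active features per input
```

After training, each input is represented by a few active dictionary features; in XNNTab these sparse features are the units that get assigned human-interpretable concepts.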

Khawla Elhadri, Jörg Schlötterer, Christin Seifert • 2025

Related benchmarks

Task                      Dataset  Metric    Result  Rank
Sentiment Classification  CR       Accuracy  78      142
Classification            CO       Accuracy  0.968   39
Classification            GE       Accuracy  66.5    37
Classification            AD       Accuracy  85      8
Classification            CH       Accuracy  86.1    7
Tabular Classification    CH       Macro F1  75.2    6
Tabular Classification    SB       Macro F1  94.8    6
Tabular Classification    CO       Macro F1  87.8    6
Tabular Classification    GE       Macro F1  63.4    6
Tabular Classification    CR       Macro F1  0.7     6

(Showing 10 of 14 rows)
