
LassoNet: A Neural Network with Feature Sparsity

About

Much work has been done recently to make neural networks more interpretable, and one obvious approach is to arrange for the network to use only a subset of the available features. In linear models, Lasso (or $\ell_1$-regularized) regression assigns zero weights to the most irrelevant or redundant features, and is widely used in data science. However, the Lasso applies only to linear models. Here we introduce LassoNet, a neural network framework with global feature selection. Our approach enforces a hierarchy: a feature can participate in a hidden unit only if its linear representative is active. Unlike other approaches to feature selection for neural nets, our method uses a modified objective function with constraints, and so integrates feature selection directly with parameter learning. As a result, it delivers an entire regularization path of solutions with a range of feature sparsity. In systematic experiments, LassoNet significantly outperforms state-of-the-art methods for feature selection and regression. The LassoNet method uses projected proximal gradient descent, and generalizes directly to deep networks. It can be implemented by adding just a few lines of code to a standard neural network.
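To make the hierarchy concrete, here is a minimal NumPy sketch of one projected proximal gradient step. It is illustrative only, not the paper's exact HIER-PROX operator: after a gradient step, the skip-layer (linear) weights are soft-thresholded for the $\ell_1$ penalty, and each feature's first-layer weights are clipped into the box implied by the constraint $\|W^{(1)}_j\|_\infty \le M\,|\theta_j|$. All names (`theta`, `W1`, `M`, `hierarchy_step`) are hypothetical.

```python
import numpy as np

def soft_threshold(x, t):
    """Elementwise soft-thresholding, the proximal operator of the l1 penalty."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def hierarchy_step(theta, W1, lam, M, lr, grad_theta, grad_W1):
    """One simplified projected proximal gradient step (a sketch, not the
    paper's HIER-PROX).

    theta : (p,)   skip-layer (linear) weights, one per input feature
    W1    : (p, k) first hidden-layer weights

    The hierarchy constraint is ||W1[j, :]||_inf <= M * |theta[j]|:
    a feature can participate in the hidden units only if its linear
    representative is active.
    """
    # Plain gradient step on both parameter groups.
    theta = theta - lr * grad_theta
    W1 = W1 - lr * grad_W1
    # Proximal step: soft-threshold the linear weights (l1 penalty).
    theta = soft_threshold(theta, lr * lam)
    # Projection: clip each feature's hidden-layer weights into the box
    # [-M|theta_j|, M|theta_j|] so the hierarchy holds exactly.
    bound = M * np.abs(theta)[:, None]
    W1 = np.clip(W1, -bound, bound)
    return theta, W1
```

Note how feature selection falls out of the projection: once a feature's linear weight is thresholded to zero, its entire first-layer weight row is clipped to zero, so the feature is removed from the network globally, and sweeping the penalty `lam` traces out a regularization path of increasingly sparse models.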

Ismael Lemhadri, Feng Ruan, Louis Abraham, Robert Tibshirani • 2019

Related benchmarks

Task           | Dataset                               | Result               | Rank
Classification | HI                                    | Accuracy 0.814       | 45
Classification | CO                                    | Accuracy 0.97        | 39
Classification | HE                                    | Accuracy 39.3        | 38
Classification | GE                                    | Accuracy 57.8        | 37
Classification | JA                                    | Accuracy 73.6        | 29
Regression     | YE                                    | Negative RMSE -0.777 | 29
Classification | AL                                    | Accuracy 96.3        | 29
Classification | AL (with second-order extra features) | Accuracy 96.3        | 29
Regression     | HO                                    | Negative RMSE -0.551 | 29
Classification | AL                                    | Accuracy 96.2        | 29
Showing 10 of 53 rows
