
Infinite Feature Selection: A Graph-based Feature Filtering Approach

About

We propose a filter-based feature selection framework that treats subsets of features as paths in a graph, where a node is a feature and an edge encodes a pairwise (customizable) relation between features that accounts for both relevance and redundancy. Through two different interpretations (one exploiting properties of power series of matrices, the other relying on Markov chain fundamentals), we can evaluate the values of paths (i.e., feature subsets) of arbitrary length, eventually letting the length go to infinity, hence the name Infinite Feature Selection (Inf-FS). Going to infinity allows us to constrain the computational complexity of the selection process and to rank the features in an elegant way, that is, by considering the value of every path (subset) that contains a particular feature. We also propose a simple unsupervised strategy to cut the ranking, thereby providing the subset of features to keep. In the experiments, we analyze diverse settings with heterogeneous features, for a total of 11 benchmarks, comparing against 18 widely known approaches. The results show that Inf-FS performs better in almost every situation, that is, both when the number of features to keep is fixed a priori and when deciding the subset cardinality is part of the process.
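The power-series interpretation mentioned in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the pairwise relations are already collected in a non-negative matrix `A`, and it assumes a damping factor `r = scale / rho(A)` with `scale = 0.9` so that the geometric series of `r * A` converges. The function name `inf_fs_scores` and the `scale` parameter are hypothetical names introduced here. Under those assumptions, the sum over all path lengths has the closed form `(I - rA)^(-1) - I`, and the row sums of that matrix give one score per feature.

```python
import numpy as np


def inf_fs_scores(A, scale=0.9):
    """Score features by summing the values of paths of every length.

    A     : (n, n) non-negative matrix of pairwise feature relations
            (how A is built is left to the caller; it is an assumption here).
    scale : hypothetical knob in (0, 1); the damping factor is scale / rho(A),
            which keeps the spectral radius of r * A below 1.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]

    # Spectral radius of A: largest eigenvalue magnitude.
    rho = np.max(np.abs(np.linalg.eigvals(A)))
    if rho == 0:
        # No relations at all: every path has value zero.
        return np.zeros(n)

    r = scale / rho

    # Geometric series of matrices: sum_{l >= 1} (rA)^l = (I - rA)^{-1} - I.
    identity = np.eye(n)
    S = np.linalg.inv(identity - r * A) - identity

    # Row i sums the values of all paths that start at feature i.
    return S.sum(axis=1)


if __name__ == "__main__":
    # Toy 3-feature chain graph: feature 1 is connected to both neighbours.
    A = np.array([[0.0, 1.0, 0.0],
                  [1.0, 0.0, 1.0],
                  [0.0, 1.0, 0.0]])
    scores = inf_fs_scores(A)
    ranking = np.argsort(-scores)  # highest score first
    print("scores:", scores)
    print("ranking:", ranking)
```

The closed form is what keeps the cost bounded: a single matrix inversion replaces the enumeration of paths of each length. How to cut the resulting ranking into a final subset is a separate step that this sketch does not cover.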

Giorgio Roffo, Simone Melzi, Umberto Castellani, Alessandro Vinciarelli, Marco Cristani • 2020

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Feature Selection | 1000x4-3 +2NF synthetic | Mean Proportion Correct: 33 | 8 |
| Feature Selection | 2NF synthetic 1000x4-5 | Mean Correct Features Proportion: 33 | 8 |
| Feature Selection | Synthetic 1000x4-10 +2NF | Mean Selection Proportion: 33 | 8 |
| Feature Selection | Synthetic 1000x10-3 +5NF | Mean Correct Feature Proportion: 33 | 8 |
| Feature Selection | 1000x10-5 +5NF synthetic | Mean Proportion Correct Features: 33 | 8 |
| Feature Selection | 1000x10-10 +5NF synthetic | Mean Proportion Correct: 33 | 8 |
| Feature Selection | Synthetic 2000x20-5 +10NF | Mean Selection Rate: 33 | 8 |
| Feature Selection | Synthetic 2000x20-10 +10NF | Mean Proportion Correct: 33 | 8 |
| Feature Selection | Synthetic 2000x20-20 +10NF | Mean Correct Feature Proportion: 33 | 8 |
| Feature Selection | Synthetic 2000x30-5 +15NF | Mean Correct Feature Proportion: 33 | 8 |
Showing 10 of 22 rows
