SCOPE-FE: Structured Control of Operator and Pairwise Exploration for Feature Engineering

About

Automatic feature engineering is an effective approach for improving predictive performance in tabular learning. However, expand-and-reduce methods, such as OpenFE, become increasingly computationally expensive as the input dimensionality grows. This limitation arises primarily from the combinatorial explosion of candidate features generated through operator-feature combinations. To address this issue, we propose SCOPE-FE, a structured search space control framework that improves efficiency by reducing the candidate space prior to feature generation. SCOPE-FE jointly regulates two major sources of combinatorial growth: the operator space and feature-pair space. First, OperatorProbing estimates the dataset-specific utility of candidate operators and eliminates low-contribution operators in advance. Second, FeatureClustering employs spectral embedding and fuzzy c-means clustering to group structurally related features, thereby restricting candidate generation to relevant within-cluster combinations. In addition, we introduce ReliabilityScoring, which incorporates variance across subsamples to stabilize pruning decisions. Experiments on ten benchmark datasets demonstrate that SCOPE-FE substantially reduces feature engineering time while maintaining competitive predictive performance relative to existing baselines. The efficiency gains are particularly pronounced for high-dimensional datasets. These results indicate that structured control of the search space is an effective strategy for scalable automatic feature engineering. The code will be made publicly available upon acceptance.
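The two pruning ideas in the abstract can be illustrated with a small sketch. It is not the authors' implementation, since the code has not been released. The function names, the membership threshold used to assign a feature to a fuzzy cluster, and the "mean minus penalty times standard deviation" form of the reliability score are all assumptions made for illustration; the abstract only says that candidate pairs are restricted to within-cluster combinations and that variance across subsamples is taken into account.

```python
from itertools import combinations

import numpy as np


def within_cluster_pairs(memberships, threshold=0.3):
    """Return feature-index pairs that share at least one fuzzy cluster.

    memberships: (n_features, n_clusters) matrix of fuzzy memberships,
    e.g. as produced by fuzzy c-means (each row sums to 1).
    threshold: hypothetical cutoff; a feature counts as a member of every
    cluster in which its membership is at least this value.
    """
    belongs = np.asarray(memberships, dtype=float) >= threshold
    pairs = set()
    for k in range(belongs.shape[1]):
        members = np.flatnonzero(belongs[:, k]).tolist()
        # Only combine features that fall in the same cluster.
        pairs.update(combinations(members, 2))
    return sorted(pairs)


def reliability_score(gains, penalty=1.0):
    """Assumed scoring rule: mean gain across subsamples minus a
    penalty proportional to the standard deviation of that gain."""
    gains = np.asarray(gains, dtype=float)
    return float(gains.mean() - penalty * gains.std())


# Four features, two clusters; feature 2 belongs to both clusters.
memberships = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.5, 0.5],
    [0.1, 0.9],
]
pairs = within_cluster_pairs(memberships)
all_pairs = list(combinations(range(4), 2))
print(f"{len(pairs)} of {len(all_pairs)} pairs kept: {pairs}")

# Same mean gain, different stability across subsamples.
print(reliability_score([0.02, 0.02, 0.02]))
print(reliability_score([0.00, 0.04]))
```

In this toy case, clustering keeps 4 of the 6 possible feature pairs, and the candidate whose gain fluctuates across subsamples receives a lower score than the stable one despite having the same mean. The saving from within-cluster restriction grows with dimensionality, because the number of unrestricted pairs grows quadratically in the number of features.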

Minhee Park, Seongyeon Son, Yonghyun Lee, Eunchan Kim • 2026

Related benchmarks

Task            Dataset  Metric    Result  Rank
Classification  JA       Accuracy  73.1    48
Classification  CO       Accuracy  0.966   43
Classification  DI       AUC       89      7
Classification  TE       AUC       67.3    7
Regression      ME       RMSE      993.3   7
Classification  BR       AUC       77.1    6
Classification  NO       AUC       0.996   6
Regression      CA       RMSE      0.42    5
Classification  VE       AUC       92.7    5
Regression      MI       RMSE      0.738   5