Learning concise representations for regression by evolving networks of trees

About

We propose and study a method for learning interpretable representations for the task of regression. Features are represented as networks of multi-type expression trees comprised of activation functions common in neural networks in addition to other elementary functions. Differentiable features are trained via gradient descent, and the performance of features in a linear model is used to weight the rate of change among subcomponents of each representation. The search process maintains an archive of representations with accuracy-complexity trade-offs to assist in generalization and interpretation. We compare several stochastic optimization approaches within this framework. We benchmark these variants on 100 open-source regression problems in comparison to state-of-the-art machine learning approaches. Our main finding is that this approach produces the highest average test scores across problems while producing representations that are orders of magnitude smaller than the next best performing method (gradient boosting). We also report a negative result in which attempts to directly optimize the disentanglement of the representation result in more highly correlated features.
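The archive of accuracy-complexity trade-offs described above can be illustrated with a small Pareto filter. This is a minimal sketch and not the authors' implementation: the `Candidate` record, its `error` and `complexity` fields, and the example values are all hypothetical, and it assumes each representation has already been scored by a linear model fitted on its features.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical summary of one representation (assumed fields)."""
    name: str        # label for the representation
    error: float     # e.g. validation error of a linear model on its features (lower is better)
    complexity: int  # e.g. total node count of its expression trees (lower is better)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on at least one."""
    no_worse = a.error <= b.error and a.complexity <= b.complexity
    strictly_better = a.error < b.error or a.complexity < b.complexity
    return no_worse and strictly_better


def pareto_archive(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates, sorted by complexity."""
    front = [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]
    return sorted(front, key=lambda c: c.complexity)


# Made-up example population, for illustration only.
population = [
    Candidate("a", error=0.30, complexity=3),
    Candidate("b", error=0.20, complexity=10),
    Candidate("c", error=0.25, complexity=12),  # dominated by "b"
    Candidate("d", error=0.10, complexity=40),
]
archive = pareto_archive(population)
print([c.name for c in archive])  # ['a', 'b', 'd']
```

Keeping the whole non-dominated set, rather than only the most accurate candidate, lets a user pick a smaller representation at a modest cost in accuracy.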

William La Cava, Tilak Raj Singh, James Taggart, Srinivas Suri, Jason H. Moore • 2018

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Symbolic Regression | SRBench black-box (test) | R^2 | 0.7621 | 28 |
| Symbolic Regression | Strogatz Dataset epsilon=0.001 (test) | R^2 | 0.9244 | 20 |
| Symbolic Regression | Strogatz Dataset epsilon=0.1 (test) | R^2 | 92.28 | 20 |
| Symbolic Regression | Strogatz Dataset epsilon=0.01 (test) | R^2 | 0.9244 | 20 |
| Symbolic Regression | Feynman Dataset epsilon=0.001 (test) | R^2 | 92.07 | 20 |
| Symbolic Regression | Feynman Dataset epsilon=0.01 (test) | R^2 | 0.9212 | 20 |
| Symbolic Regression | Feynman Dataset epsilon=0.1 (test) | R^2 | 0.9195 | 20 |
| Symbolic Regression | Strogatz Dataset ϵ = 0.0 (test) | R^2 | 0.921 | 20 |
| Symbolic Regression | Feynman Dataset ϵ = 0.0 (test) | R^2 | 0.919 | 20 |
| Regression | Clinical Triage Dataset MAP target stratified 75/25 (test) | R^2 | 1 | 6 |
Showing 10 of 14 rows
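
The results above are R^2 (coefficient of determination) scores. As a reading aid, here is a minimal sketch of the standard definition, R^2 = 1 - SS_res / SS_tot. It assumes the usual textbook formula; the exact evaluation protocol behind each benchmark row (noise level, data split, score scaling) is not stated on this page, and the function name and example values below are illustrative only.

```python
from typing import Sequence


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Made-up targets and predictions, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
score = r2_score(y_true, y_pred)
print(round(score, 4))  # 0.98
```

A score of 1 means the predictions match the targets exactly, and a score of 0 means the model does no better than predicting the mean of the targets.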
