Deep symbolic regression: Recovering mathematical expressions from data via risk-seeking policy gradients

About

Discovering the underlying mathematical expressions describing a dataset is a core challenge for artificial intelligence. This is the problem of $\textit{symbolic regression}$. Despite recent advances in training neural networks to solve complex tasks, deep learning approaches to symbolic regression are underexplored. We propose a framework that leverages deep learning for symbolic regression via a simple idea: use a large model to search the space of small models. Specifically, we use a recurrent neural network to emit a distribution over tractable mathematical expressions and employ a novel risk-seeking policy gradient to train the network to generate better-fitting expressions. Our algorithm outperforms several baseline methods (including Eureqa, the gold standard for symbolic regression) in its ability to exactly recover symbolic expressions on a series of benchmark problems, both with and without added noise. More broadly, our contributions include a framework that can be applied to optimize hierarchical, variable-length objects under a black-box performance metric, with the ability to incorporate constraints in situ, and a risk-seeking policy gradient formulation that optimizes for best-case performance instead of expected performance.
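The abstract contrasts optimizing for best-case performance with optimizing for expected performance. One way to picture this is as a reweighting of sampled expressions before the gradient step. The sketch below is a minimal illustration, not the authors' implementation. It assumes that "risk-seeking" means keeping only the top `epsilon` fraction of a sampled batch and using the reward at that cutoff as the baseline. The function name `risk_seeking_weights` and the parameter `epsilon` are hypothetical, and the rewards are made-up numbers standing in for fitness scores of sampled expressions.

```python
import math


def risk_seeking_weights(rewards, epsilon):
    """Return (weights, threshold) for a batch of sampled expressions.

    Assumption: only the top `epsilon` fraction of samples contributes,
    and each contribution is its reward minus the cutoff reward.
    Each weight would multiply the gradient of that sample's
    log-probability under the generating network.
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    if not 0.0 < epsilon <= 1.0:
        raise ValueError("epsilon must be in (0, 1]")

    n = len(rewards)
    k = max(1, math.ceil(epsilon * n))  # number of samples kept
    # Reward of the k-th best sample: an empirical (1 - epsilon) quantile.
    threshold = sorted(rewards, reverse=True)[k - 1]

    weights = [
        (r - threshold) / k if r >= threshold else 0.0
        for r in rewards
    ]
    return weights, threshold


if __name__ == "__main__":
    # Hypothetical fitness scores for four sampled expressions.
    batch_rewards = [0.1, 0.5, 0.9, 0.3]
    weights, threshold = risk_seeking_weights(batch_rewards, epsilon=0.5)
    print("threshold:", threshold)
    print("weights:", weights)
```

Under this assumption, samples below the cutoff receive zero weight, so poorly fitting expressions do not pull the update at all. A standard policy gradient with a mean baseline would instead push their probability down and optimize the batch average.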

Brenden K. Petersen, Mikel Landajuela, T. Nathan Mundhenk, Claudio P. Santiago, Soo K. Kim, Joanne T. Kim • 2019

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Symbolic Regression | 3D Advection Equation (test) | MSE 0.139 | 60 |
| Symbolic Regression | E. coli growth LLM-SR Suite | NMSE 0.182 | 44 |
| 1D Physics Modeling | 1d Burgers' equation (test) | MSE 0.0059 | 38 |
| 1D Advection Equation Modeling | 1D Advection Equation | MSE 0.159 | 38 |
| Modeling 1D Advection-Diffusion Equation | 1D Advection-Diffusion Equation S-I (test) | MSE 0.0191 | 38 |
| Symbolic Regression | 2D Advection Equation (test) | MSE 0.26 | 38 |
| Symbolic Regression | Oscillation 1 LLM-SR Suite | NMSE 0.0104 | 30 |
| Symbolic Regression | SRBench black-box (test) | R^2 0.5625 | 28 |
| Symbolic Regression | LSR-Synth | Overall Acc (Tol 0.01) 0.00e+0 | 22 |
| Symbolic Regression | Strogatz Dataset epsilon=0.01 (test) | R2 Score 0.8199 | 20 |
Showing 10 of 53 rows
