Symbolic Regression via Neural-Guided Genetic Programming Population Seeding

About

Symbolic regression is the process of identifying mathematical expressions that fit observed output from a black-box process. It is a discrete optimization problem generally believed to be NP-hard. Prior approaches to solving the problem include neural-guided search (e.g. using reinforcement learning) and genetic programming. In this work, we introduce a hybrid neural-guided/genetic programming approach to symbolic regression and other combinatorial optimization problems. We propose a neural-guided component used to seed the starting population of a random restart genetic programming component, gradually learning better starting populations. On a number of common benchmark tasks to recover underlying expressions from a dataset, our method recovers 65% more expressions than a recently published top-performing model using the same experimental setup. We demonstrate that running many genetic programming generations without interdependence on the neural-guided component performs better for symbolic regression than alternative formulations where the two are more strongly coupled. Finally, we introduce a new set of 22 symbolic regression benchmark problems with increased difficulty over existing benchmarks. Source code is provided at www.github.com/brendenpetersen/deep-symbolic-optimization.
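The loop described in the abstract (a learned sampler seeds a random-restart genetic programming run, and the best results of that run feed back into the sampler) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the neural-guided component is replaced here by a plain categorical distribution over tokens, the genetic programming step uses only subtree mutation with truncation selection, and every name (`sample_tree`, `run_gp`, `seeded_search`, the token set, the learning rate) is hypothetical.

```python
import math
import random

# Hypothetical, tiny token library: two binary operators and two leaves.
BINARY = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}
LEAVES = ["x", "1"]
TOKENS = list(BINARY) + LEAVES
UNIFORM = {t: 1.0 / len(TOKENS) for t in TOKENS}

def sample_tree(probs, rng, depth=0, max_depth=3):
    """Sample an expression tree; `probs` stands in for the learned sampler."""
    choices = LEAVES if depth >= max_depth else TOKENS
    tok = rng.choices(choices, weights=[probs[t] for t in choices])[0]
    if tok in BINARY:
        return (tok,
                sample_tree(probs, rng, depth + 1, max_depth),
                sample_tree(probs, rng, depth + 1, max_depth))
    return tok

def evaluate(tree, x):
    """Evaluate a tree (nested tuples with string leaves) at the point x."""
    if tree == "x":
        return x
    if tree == "1":
        return 1.0
    op, left, right = tree
    return BINARY[op](evaluate(left, x), evaluate(right, x))

def fitness(tree, xs, ys):
    """Squashed inverse RMSE in (0, 1]; higher is better."""
    mse = sum((evaluate(tree, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    return 1.0 / (1.0 + math.sqrt(mse))

def mutate(tree, rng, depth=0, max_depth=3):
    """Replace a random subtree with a uniformly sampled one (sampler not used)."""
    if isinstance(tree, str) or rng.random() < 0.3:
        return sample_tree(UNIFORM, rng, depth, max_depth)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(left, rng, depth + 1, max_depth), right)
    return (op, left, mutate(right, rng, depth + 1, max_depth))

def run_gp(population, xs, ys, rng, generations=10):
    """Run several GP generations with no calls back into the sampler."""
    pop = list(population)
    for _ in range(generations):
        children = [mutate(rng.choice(pop), rng) for _ in pop]
        ranked = sorted(pop + children,
                        key=lambda t: fitness(t, xs, ys), reverse=True)
        pop = ranked[:len(population)]  # truncation selection (a simplification)
    return pop

def count_tokens(tree, counts):
    """Accumulate how often each token appears in a tree."""
    if isinstance(tree, str):
        counts[tree] += 1
    else:
        counts[tree[0]] += 1
        count_tokens(tree[1], counts)
        count_tokens(tree[2], counts)

def seeded_search(xs, ys, iterations=20, pop_size=30, top_k=5, lr=0.3, seed=0):
    rng = random.Random(seed)
    probs = dict(UNIFORM)
    best = None
    for _ in range(iterations):
        # 1. The sampler proposes a fresh starting population (random restart).
        seeds = [sample_tree(probs, rng) for _ in range(pop_size)]
        # 2. GP refines that population on its own.
        elite = run_gp(seeds, xs, ys, rng)[:top_k]
        if best is None or fitness(elite[0], xs, ys) > fitness(best, xs, ys):
            best = elite[0]
        # 3. Nudge the sampler toward the token frequencies of the elite trees.
        counts = {t: 1e-3 for t in TOKENS}  # small smoothing term
        for tree in elite:
            count_tokens(tree, counts)
        total = sum(counts.values())
        for t in TOKENS:
            probs[t] = (1 - lr) * probs[t] + lr * counts[t] / total
    return best, probs

if __name__ == "__main__":
    xs = [i / 4 for i in range(-4, 5)]
    ys = [x * x + x for x in xs]
    best, probs = seeded_search(xs, ys)
    print(best, round(fitness(best, xs, ys), 3))
```

In this sketch `run_gp` never consults the sampler, which mirrors the loosely coupled formulation the abstract reports as working better; the sampler only sees the outcome of each completed run.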

T. Nathan Mundhenk, Mikel Landajuela, Ruben Glatt, Claudio P. Santiago, Daniel M. Faissol, Brenden K. Petersen • 2021

Related benchmarks

Task | Dataset | Metric | Result | Rank
Symbolic Regression | DGSR benchmark | Recall | 100 | 22
Symbolic Regression | Nguyen, Livermore, and Keijzer Consolidated 1.0 (Length <= 8) | Exact Recovery Rate | 98 | 10
Symbolic Regression | Nguyen, Livermore, and Keijzer Consolidated 1.0 (Length 9-10) | Exact Recovery Rate | 77 | 10
Symbolic Regression | Nguyen, Livermore, and Keijzer Consolidated 1.0 (Length 11-12) | Average Exact Recovery Rate | 0.25 | 10
Symbolic Regression | Nguyen, Livermore, and Keijzer Consolidated 1.0 (Length 13-14) | Average Exact Recovery Rate | 29 | 10
Symbolic Regression | Nguyen, Livermore, and Keijzer Consolidated 1.0 (Length 15-16) | Exact Recovery Rate | 0.16 | 10
Symbolic Regression | Nguyen, Livermore, and Keijzer Consolidated 1.0 (Length 17-20) | Average Exact Recovery Rate | 2 | 10
Symbolic Regression | Nguyen, Livermore, and Keijzer Consolidated 1.0 (Length 21-30) | Average Exact Recovery Rate | 1.70e+3 | 10
Symbolic Regression | Nguyen, Livermore, and Keijzer Consolidated 1.0 (Length >= 31) | Exact Recovery Rate | 0.00e+0 | 10