
Symbolic Regression with a Learned Concept Library

About

We present a novel method for symbolic regression (SR), the task of searching for compact programmatic hypotheses that best explain a dataset. The problem is commonly solved using genetic algorithms; we show that we can enhance such methods by inducing a library of abstract textual concepts. Our algorithm, called LaSR, uses zero-shot queries to a large language model (LLM) to discover and evolve concepts occurring in known high-performing hypotheses. We discover new hypotheses using a mix of standard evolutionary steps and LLM-guided steps (obtained through zero-shot LLM queries) conditioned on discovered concepts. Once discovered, hypotheses are used in a new round of concept abstraction and evolution. We validate LaSR on the Feynman equations, a popular SR benchmark, as well as a set of synthetic tasks. On these benchmarks, LaSR substantially outperforms a variety of state-of-the-art SR approaches based on deep learning and evolutionary algorithms. Moreover, we show that LaSR can be used to discover a novel and powerful scaling law for LLMs.

Arya Grayeli, Atharva Sehgal, Omar Costilla-Reyes, Miles Cranmer, Swarat Chaudhuri • 2024
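
The loop described in the abstract (evolutionary search interleaved with concept-guided steps, followed by concept abstraction from the best hypotheses) can be illustrated with a toy sketch. This is not the authors' implementation: the expression encoding, every function name, and all parameters below are assumptions made for illustration. In particular, abstract_concepts and the guided branch of the search are simple frequency-based stand-ins for what LaSR does with zero-shot LLM queries over textual concepts.

```python
import math
import random
from collections import Counter

UNARY = {"sin": math.sin, "cos": math.cos}
BINARY = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}
ALL_OPS = list(UNARY) + list(BINARY)

def random_expr(rng, depth, ops):
    """Build a random expression tree over the variable 'x' using the given operators."""
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.7 else round(rng.uniform(-2, 2), 2)
    op = rng.choice(ops)
    if op in UNARY:
        return (op, random_expr(rng, depth - 1, ops))
    return (op, random_expr(rng, depth - 1, ops), random_expr(rng, depth - 1, ops))

def evaluate(expr, x):
    """Evaluate a tree: 'x' is the variable, numbers are constants, tuples are operators."""
    if expr == "x":
        return x
    if isinstance(expr, (int, float)):
        return expr
    op, *args = expr
    vals = [evaluate(a, x) for a in args]
    return UNARY[op](*vals) if op in UNARY else BINARY[op](*vals)

def mse(expr, data):
    """Mean squared error of a hypothesis on (x, y) pairs."""
    return sum((evaluate(expr, x) - y) ** 2 for x, y in data) / len(data)

def mutate(expr, rng, ops):
    """Standard evolutionary step: replace one random subtree with a fresh one."""
    if not isinstance(expr, tuple) or rng.random() < 0.3:
        return random_expr(rng, 2, ops)
    op, *args = expr
    i = rng.randrange(len(args))
    args[i] = mutate(args[i], rng, ops)
    return (op, *args)

def operators_in(expr):
    """List every operator name that appears in a tree."""
    if not isinstance(expr, tuple):
        return []
    return [expr[0]] + [o for a in expr[1:] for o in operators_in(a)]

def abstract_concepts(top_hypotheses, k=2):
    """ASSUMPTION: stand-in for the LLM concept-abstraction query.
    Here a 'concept' is just an operator that is frequent among the best hypotheses."""
    counts = Counter(o for h in top_hypotheses for o in operators_in(h))
    return [op for op, _ in counts.most_common(k)]

def search(data, generations=30, pop_size=40, p_guided=0.5, seed=0):
    """Alternate selection, concept abstraction, and mixed plain/guided mutation."""
    rng = random.Random(seed)
    population = [random_expr(rng, 3, ALL_OPS) for _ in range(pop_size)]
    concepts, history = [], []
    for _ in range(generations):
        population.sort(key=lambda h: mse(h, data))
        survivors = population[: pop_size // 4]
        history.append(mse(survivors[0], data))
        # Refresh the concept library from the current best hypotheses.
        concepts = abstract_concepts(survivors) or concepts
        children = []
        while len(survivors) + len(children) < pop_size:
            parent = rng.choice(survivors)
            # ASSUMPTION: a "guided" step only restricts mutation to concept operators;
            # LaSR instead conditions an LLM on natural-language concepts.
            guided = concepts and rng.random() < p_guided
            children.append(mutate(parent, rng, concepts if guided else ALL_OPS))
        population = survivors + children
    best = min(population, key=lambda h: mse(h, data))
    return best, concepts, history

if __name__ == "__main__":
    # Toy target y = x * sin(x), sampled on [-3, 3].
    data = [(i / 4, (i / 4) * math.sin(i / 4)) for i in range(-12, 13)]
    best, concepts, history = search(data)
    print("best hypothesis:", best)
    print("concept library:", concepts)
    print("best MSE:", mse(best, data))
```

Because the survivors are carried over unchanged, the best error never gets worse from one generation to the next. The p_guided parameter (hypothetical) controls how often a guided step is taken instead of a plain mutation.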

Related benchmarks

Task                 Dataset                      Metric                  Result    Rank
Symbolic Regression  E. coli growth LLM-SR Suite  NMSE                    0.0085    44
Symbolic Regression  Oscillation 1 LLM-SR Suite   NMSE                    7.15e-6   30
Symbolic Regression  LSR-Synth                    Overall Acc (Tol 0.01)  16.02     22
Symbolic Regression  Stress–Strain (OOD)          NMSE                    0.0618    18
Symbolic Regression  CRK (ID)                     NMSE                    9.55e-11  18
Symbolic Regression  Stress–Strain (ID)           NMSE                    0.0164    18
Symbolic Regression  CRK (OOD)                    NMSE                    7.25e-8   18
Symbolic Regression  Oscillator 2 (ID)            NMSE                    2.68e-8   18
Symbolic Regression  Oscillator 2 (OOD)           NMSE                    3.51e-5   18
Symbolic Regression  Oscillator 1 (OOD)           NMSE                    0.0054    18
Showing 10 of 15 rows
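
Most results above are reported as NMSE (normalized mean squared error), where lower is better. The page does not say which normalization is used, so the sketch below assumes one common definition: mean squared error divided by the variance of the target values. Individual benchmark suites may normalize differently, and the function name is illustrative.

```python
def nmse(y_true, y_pred):
    """Mean squared error divided by the variance of the targets (assumed definition)."""
    n = len(y_true)
    mean = sum(y_true) / n
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    var = sum((t - mean) ** 2 for t in y_true) / n
    return mse / var

y = [1.0, 2.0, 3.0, 4.0]
print(nmse(y, y))          # 0.0: a perfect fit
print(nmse(y, [2.5] * 4))  # 1.0: no better than always predicting the mean
```

Under this definition, a value such as 0.0085 would mean the residual error is under 1% of the variance of the targets.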
