DrSR: LLM based Scientific Equation Discovery with Dual Reasoning from Data and Experience

About

Symbolic regression is a fundamental tool for discovering interpretable mathematical expressions from data, with broad applications across scientific and engineering domains. Recently, large language models (LLMs) have demonstrated strong performance in this task, leveraging embedded scientific priors and reasoning capabilities to surpass traditional methods. However, existing LLM-based approaches, such as LLM-SR, often over-rely on internal priors, lacking explicit data understanding and systematic reflection during equation generation. To address these limitations, we propose DrSR (Dual Reasoning Symbolic Regression), a framework that combines data-driven insight with reflective learning to enhance both robustness and discovery capability. Specifically, DrSR guides LLMs to analyze structural relationships (e.g., monotonicity, nonlinearity, and correlation) within the data to generate structured descriptions. Simultaneously, it monitors equation performance and establishes a feedback loop to refine subsequent generations. By integrating data understanding and generation reflection in a closed loop, DrSR enables more efficient exploration of the symbolic expression space. Experiments across interdisciplinary datasets in physics, chemistry, biology, and materials science demonstrate that DrSR substantially improves the valid equation rate and consistently outperforms both classical and recent LLM-based methods in terms of accuracy, generalization, and search efficiency. These results underscore its potential for scientific equation discovery.
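The data-analysis step described above (checking monotonicity, nonlinearity, and correlation before generating equations) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the thresholds (0.95 and 0.05), and the wording of the generated description are all hypothetical, chosen only to show how structural relationships in data might be turned into a structured text description for an LLM prompt.

```python
import numpy as np

def rank(values):
    # Ordinal ranks 0..n-1; ties are broken by position, which is adequate for a sketch.
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks

def describe_relationship(x, y):
    """Summarize how y depends on x (hypothetical helper, not from the paper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    pearson = float(np.corrcoef(x, y)[0, 1])               # linear correlation
    spearman = float(np.corrcoef(rank(x), rank(y))[0, 1])  # rank (monotonic) correlation
    return {
        "pearson": pearson,
        "spearman": spearman,
        # Assumed threshold: a rank correlation near +/-1 suggests a monotonic trend.
        "monotonic": bool(abs(spearman) > 0.95),
        # Assumed threshold: monotonic but weakly linear suggests a nonlinear trend.
        "nonlinear": bool(abs(spearman) - abs(pearson) > 0.05),
    }

def to_description(name_x, name_y, stats):
    """Render the statistics as one line of text that could be added to a prompt."""
    trend = "monotonic" if stats["monotonic"] else "non-monotonic"
    shape = "nonlinear" if stats["nonlinear"] else "approximately linear"
    return (f"{name_y} vs {name_x}: {trend}, {shape} "
            f"(Pearson {stats['pearson']:.2f}, Spearman {stats['spearman']:.2f})")

# Example: exponential growth is monotonic but far from linear.
x = np.linspace(0.0, 3.0, 50)
stats = describe_relationship(x, np.exp(2.0 * x))
print(to_description("t", "population", stats))
```

In a closed loop of the kind the abstract describes, a description like this would be combined with feedback on how earlier candidate equations performed; that reflection step is not sketched here.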

Runxiang Wang, Boxiao Wang, Kai Li, Yifan Zhang, Jian Cheng • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Symbolic Regression | E. coli growth LLM-SR Suite | NMSE 0.0192 | 44 |
| Symbolic Regression | Oscillation 1 LLM-SR Suite | NMSE 7.73e-13 | 30 |
| Symbolic Regression | Oscillator 1 (OOD) | NMSE 1.87e-6 | 18 |
| Symbolic Regression | CRK (ID) | NMSE 1.42e-10 | 18 |
| Symbolic Regression | CRK (OOD) | NMSE 3.44e-8 | 18 |
| Symbolic Regression | Stress–Strain (OOD) | NMSE 0.073 | 18 |
| Symbolic Regression | Stress–Strain (ID) | NMSE 0.0211 | 18 |
| Symbolic Regression | Oscillator 2 (ID) | NMSE 9.51e-5 | 18 |
| Symbolic Regression | Oscillator 2 (OOD) | NMSE 0.0019 | 18 |
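
The results above are reported as NMSE (normalized mean squared error), which this page does not define. A minimal sketch, assuming one common definition (squared error divided by the total squared deviation of the targets from their mean); the benchmark may normalize differently, and the function name `nmse` is illustrative:

```python
import numpy as np

def nmse(y_true, y_pred):
    """Normalized MSE under an assumed definition: sum of squared errors
    divided by the sum of squared deviations of y_true from its mean."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = np.sum((y_true - y_pred) ** 2)
    spread = np.sum((y_true - y_true.mean()) ** 2)
    return float(residual / spread)

# A perfect fit scores 0; always predicting the mean of y_true scores 1.
y = np.array([1.0, 2.0, 3.0, 4.0])
print(nmse(y, y))
print(nmse(y, np.full_like(y, y.mean())))
```

Under this definition lower is better, and the score does not depend on the units of the target variable, which makes values comparable across datasets.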