STRIDE: A Self-Reflective Agent Framework for Reliable Automatic Equation Discovery
About
LLM-based equation discovery offers a promising route to recovering symbolic laws from data, but many systems still rely on generation-centered loops that propose candidates, fit parameters, score results, and reuse selected examples. Such loops can misjudge useful skeletons under unreliable fitting, discard near-correct equations that require repair, and accumulate redundant memories that provide limited guidance. We propose STRIDE, a self-reflective agent framework that improves reliability by coordinating data-aware generation, mixed-fitting evaluation, critic-executor repair, and diversity-preserving semantic memory. By turning fitted scores and candidate behavior into shared feedback, STRIDE enables equations to be proposed, assessed, refined, and reused within a closed-loop discovery process. Experiments on representative symbolic-regression benchmarks and LSR-Synth suites show that STRIDE improves accuracy, OOD robustness, and structural recovery across multiple LLM backbones, with ablations and analyses confirming the contribution of its core components.
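To make the "propose candidates, fit parameters, score results" loop concrete, the following is a minimal sketch of that generic baseline loop, not of STRIDE itself. Everything in it is an illustrative assumption: the skeleton set `SKELETONS`, the one-parameter form y ≈ c·g(x), the closed-form least-squares fit, and the use of mean squared error as the score.

```python
import math

# Hypothetical one-parameter skeletons of the form y ≈ c * g(x).
# In an LLM-based system these would be proposed by the model.
SKELETONS = {
    "c*x": lambda x: x,
    "c*x**2": lambda x: x * x,
    "c*sin(x)": math.sin,
}


def fit_scale(g, xs, ys):
    """Closed-form least-squares estimate of c in y ≈ c * g(x)."""
    gs = [g(x) for x in xs]
    denom = sum(v * v for v in gs)
    return sum(v * y for v, y in zip(gs, ys)) / denom if denom else 0.0


def mse(g, c, xs, ys):
    """Mean squared error of the fitted candidate c * g(x)."""
    return sum((c * g(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def rank_skeletons(xs, ys):
    """Fit every skeleton, score it, and return candidates sorted by error."""
    results = []
    for name, g in SKELETONS.items():
        c = fit_scale(g, xs, ys)
        results.append((mse(g, c, xs, ys), name, c))
    return sorted(results)


# Synthetic data generated from y = 3 * x**2 (assumed for illustration).
xs = [0.1 * i for i in range(1, 21)]
ys = [3.0 * x * x for x in xs]
best_error, best_name, best_c = rank_skeletons(xs, ys)[0]
```

In this sketch a skeleton is judged only by its fitted score, which is exactly where such loops become fragile: if the fit for a good skeleton fails, the skeleton is ranked poorly even though its structure is correct.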
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Symbolic Regression | Oscillator 1 (OOD) | NMSE | 5.97e-12 | 32 |
| Symbolic Regression | E. coli Growth (OOD) | Acc@0.1 | 9.23 | 14 |
| Symbolic Regression | Oscillator 1 | Acc@0.1 | 100 | 14 |
| Symbolic Regression | Oscillator 2 | Acc@0.1 | 100 | 14 |
| Symbolic Regression | E. coli Growth | Acc@0.1 | 14.6 | 14 |
| Symbolic Regression | Stress-Strain | Acc@0.1 | 88.28 | 14 |
| Symbolic Regression | PO LSR-SYNTH (OOD) | Acc@0.1 | 96.6 | 12 |
| Symbolic Regression | PO LSR-SYNTH | Acc@0.1 | 99.92 | 12 |
| Symbolic Regression | CRK LSR-SYNTH (OOD) | Acc@0.1 | 100 | 6 |
| Symbolic Regression | CRK LSR-SYNTH | Acc@0.1 | 100 | 6 |
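The two metrics in the table can be sketched as follows. The definitions here are assumptions, since this page does not state them: NMSE is taken as the squared error normalized by the variance of the targets, and Acc@0.1 as the percentage of test points whose relative error is at most 0.1. The benchmark's own definition may differ (for example, it may apply the tolerance to the maximum relative error per problem).

```python
def nmse(y_true, y_pred):
    """Normalized MSE: squared error divided by the spread of the targets.

    Assumed definition: sum((pred - true)^2) / sum((true - mean(true))^2).
    """
    mean = sum(y_true) / len(y_true)
    num = sum((p - t) ** 2 for t, p in zip(y_true, y_pred))
    den = sum((t - mean) ** 2 for t in y_true)
    return num / den


def acc_within_tol(y_true, y_pred, tol=0.1):
    """Percentage of points with relative error |pred - true| <= tol * |true|.

    Assumed definition of Acc@tol; not taken from the source.
    """
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(p - t) <= tol * abs(t))
    return 100.0 * hits / len(y_true)


# Toy values chosen for illustration only.
y_true = [1.0, 2.0, 4.0, 8.0]
y_pred = [1.05, 2.0, 4.8, 8.0]

score_nmse = nmse(y_true, y_pred)
score_acc = acc_within_tol(y_true, y_pred)  # 75.0: three of four points within 10%
```

Under these assumed definitions, lower NMSE is better and higher Acc@0.1 is better, which matches the direction of the values reported above.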