Complexity-Aware Deep Symbolic Regression with Robust Risk-Seeking Policy Gradients
About
We propose a novel deep symbolic regression (DSR) approach to enhance the robustness and interpretability of data-driven mathematical expression discovery. Our work is aligned with the popular DSR framework, which focuses on learning a data-specific expression generator without relying on pretrained models or additional search or planning procedures. Despite their success, existing DSR methods are built on recurrent neural networks, are guided solely by data fitness, and can encounter tail barriers that zero out the policy gradient, causing inefficient model updates. To overcome these limitations, we first design a decoder-only architecture that performs attention in the frequency domain and introduce a dual-indexed position encoding to conduct layer-wise generation. Second, we propose a Bayesian information criterion (BIC)-based reward function that automatically adjusts the trade-off between expression complexity and data fitness, without the need for explicit manual tuning. Third, we develop a ranking-based weighted policy update method that eliminates the tail barriers and enhances training effectiveness. Extensive benchmarks and systematic experiments demonstrate the advantages of our approach. We have released our implementation at https://github.com/ZakBastiani/CADSR.
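To make the second and third ideas concrete, the following is a minimal sketch and not the released implementation. It assumes the textbook Gaussian-residual form of BIC, n ln(MSE) + k ln(n), and one simple rank-based weighting scheme. The function names `bic_score` and `rank_weights` and the parameter `top_fraction` are hypothetical, and the exact reward and weights used in the paper may differ.

```python
import math


def bic_score(y_true, y_pred, num_params):
    """Textbook BIC under Gaussian residuals: n*ln(MSE) + k*ln(n).

    Lower is better. The k*ln(n) term penalizes complexity, so the
    fit/complexity trade-off needs no hand-tuned coefficient.
    """
    n = len(y_true)
    mse = sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n
    mse = max(mse, 1e-12)  # guard against log(0) on a perfect fit
    return n * math.log(mse) + num_params * math.log(n)


def rank_weights(rewards, top_fraction=0.5):
    """Hypothetical rank-based weights for sampled expressions.

    Only the top fraction of samples (by reward) gets a nonzero weight,
    and the weight depends on rank, not on the reward gap. Tied rewards
    among the top samples therefore still produce nonzero weights.
    """
    n = len(rewards)
    keep = max(1, int(math.ceil(top_fraction * n)))
    # Indices sorted from highest to lowest reward (ties keep input order).
    order = sorted(range(n), key=lambda i: rewards[i], reverse=True)
    weights = [0.0] * n
    total = keep * (keep + 1) / 2  # normalizer so the weights sum to 1
    for rank, idx in enumerate(order[:keep]):
        weights[idx] = (keep - rank) / total
    return weights
```

Under these assumptions, a reward could be taken as the negative BIC, so that two expressions with the same fit are separated by their parameter count. The rank-based weights stay nonzero even when all top samples share the same reward, which is the situation in which a weighting based on reward differences would vanish.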
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Symbolic Regression | SRBench black-box (test) | R^2 | 0.6155 | 53 |
| Symbolic Regression | SRBench known solutions 0.1% noise | Symbolic Solution Rate | 34.27 | 18 |
| Symbolic Regression | SRBench known solutions 1% noise | Symbolic Solution Rate | 33.57 | 18 |
| Symbolic Regression | SRBench known solutions 10% noise | Symbolic Solution Rate | 29.25 | 18 |
| Symbolic Regression | SRBench known solutions 0.0% noise | Solution Rate | 41.83 | 18 |
| Symbolic Regression | Crack initiation (train) | R^2 | 0.636 | 6 |
| Symbolic Regression | Crack initiation (test) | R^2 | 62.6 | 6 |
| Symbolic Regression | Crack initiation prediction dataset (train test) | R^2 | 0.626 | 5 |