Adaptive Problem Generation via Symbolic Representations
About
We present a method for generating training data for reinforcement learning with verifiable rewards to improve small open-weights language models on mathematical tasks. Existing data generation approaches rely on open-loop pipelines and fixed modifications that do not adapt to the model's capabilities. Furthermore, they typically operate directly on word problems, limiting control over problem structure. To address this, we perform modifications in a symbolic problem space, representing each problem as a set of symbolic variables and constraints (e.g., via algebraic frameworks such as SymPy or SMT formulations). This representation enables precise control over problem structure and automatic generation of ground-truth solutions, and it decouples mathematical reasoning from linguistic realization. We also show that it yields more diverse generations. To adapt problem difficulty to the model, we introduce a closed-loop framework that learns modification strategies through prompt optimization in symbolic space. Experimental results demonstrate that both adaptive problem generation and symbolic representation modifications contribute to improving the model's math-solving ability.
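To make the symbolic problem space concrete, here is a minimal sketch in SymPy. It is not the paper's implementation; the template, variable names, and the particular modification are illustrative assumptions. The point is that once a word problem is encoded as symbolic variables and constraints, perturbing a constant produces a new problem whose ground-truth answer is recomputed automatically:

```python
# Illustrative sketch (hypothetical template, not the paper's code):
# a word problem encoded as symbolic variables and a constraint.
import sympy as sp


def solve_problem(price, quantity, discount):
    """Ground-truth answer for: 'An item costs `price` dollars. Buying
    `quantity` of them with a `discount`-dollar coupon costs how much?'"""
    total = sp.Symbol("total")
    # Constraint defining the unknown in terms of the problem's constants.
    constraint = sp.Eq(total, price * quantity - discount)
    # Solving the constraint yields the ground-truth answer automatically.
    return sp.solve(constraint, total)[0]


# Original problem instance.
original = solve_problem(3, 7, 4)

# A modification applied in symbolic space: change one constant.
# The ground-truth answer updates without any manual re-derivation.
harder = solve_problem(13, 7, 4)
```

Because modifications act on the symbolic constants and constraints rather than on the surface text, problem structure is controlled exactly, and the word problem can be re-rendered from the symbolic form afterwards.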
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Mathematical Reasoning | GSM8K (test) | Accuracy | 81.96 | 797 |
| Mathematical Reasoning | MATH 500 | Accuracy | 61.17 | 155 |
| Mathematical Reasoning | GSM-Symbolic | GSM-Sym Accuracy | 78.22 | 43 |
| Mathematical Reasoning | GSM-PLUS | Acc (Original) | 61.5 | 28 |
| Mathematical Reasoning | GSM8K, GSM-Sym, Sym-p1/p2, MATH-500, and GSM-Plus Average (test) | Average Accuracy | 64.39 | 11 |