Adaptive Problem Generation via Symbolic Representations
About
We present a method for generating training data for reinforcement learning with verifiable rewards to improve small open-weights language models on mathematical tasks. Existing data generation approaches rely on open-loop pipelines and fixed modifications that do not adapt to the model's capabilities. Furthermore, they typically operate directly on word problems, limiting control over problem structure. To address this, we perform modifications in a symbolic problem space, representing each problem as a set of symbolic variables and constraints (e.g., via algebraic frameworks such as SymPy or SMT formulations). This representation enables precise control over problem structure and automatic generation of ground-truth solutions, and it decouples mathematical reasoning from linguistic realization. We also show that it yields more diverse generations. To adapt problem difficulty to the model, we introduce a closed-loop framework that learns modification strategies through prompt optimization in symbolic space. Experimental results demonstrate that both adaptive problem generation and symbolic representation modifications contribute to improving the model's math-solving ability.
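To make the symbolic problem space concrete, here is a minimal sketch in SymPy. It is not the paper's implementation; the template, variable names, and the particular modification are illustrative assumptions. The point is that once a word problem is encoded as symbolic variables and constraints, perturbing a constant produces a new problem whose ground-truth answer is recomputed automatically:

```python
# Illustrative sketch (hypothetical template, not the paper's code):
# a word problem encoded as symbolic variables and a constraint.
import sympy as sp


def solve_problem(price, quantity, discount):
    """Ground-truth answer for: 'An item costs `price` dollars. Buying
    `quantity` of them with a `discount`-dollar coupon costs how much?'"""
    total = sp.Symbol("total")
    # Constraint defining the unknown in terms of the problem's constants.
    constraint = sp.Eq(total, price * quantity - discount)
    # Solving the constraint yields the ground-truth answer automatically.
    return sp.solve(constraint, total)[0]


# Original problem instance.
original = solve_problem(3, 7, 4)

# A modification applied in symbolic space: change one constant.
# The ground-truth answer updates without any manual re-derivation.
harder = solve_problem(13, 7, 4)
```

Because modifications act on the symbolic constants and constraints rather than on the surface text, problem structure is controlled exactly, and the word problem can be re-rendered from the symbolic form afterwards.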
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Mathematical Reasoning | GSM8K (test) | Accuracy | 81.96 | 797 |
| Mathematical Reasoning | MATH 500 | Accuracy | 61.17 | 155 |
| Mathematical Reasoning | GSM-Symbolic | GSM-Sym Accuracy | 78.22 | 43 |
| Mathematical Reasoning | GSM-PLUS | Acc (Original) | 61.5 | 28 |
| Mathematical Reasoning | GSM8K, GSM-Sym, Sym-p1/p2, MATH-500, and GSM-Plus Average (test) | Average Accuracy | 64.39 | 11 |