
Adaptive Problem Generation via Symbolic Representations

About

We present a method for generating training data for reinforcement learning with verifiable rewards to improve small open-weights language models on mathematical tasks. Existing data generation approaches rely on open-loop pipelines and fixed modifications that do not adapt to the model's capabilities. Furthermore, they typically operate directly on word problems, which limits control over problem structure. To address this, we perform modifications in a symbolic problem space, representing each problem as a set of symbolic variables and constraints (e.g., via algebraic frameworks such as SymPy or SMT formulations). This representation enables precise control over problem structure and automatic generation of ground-truth solutions, and it decouples mathematical reasoning from linguistic realization. We also show that it yields more diverse generations. To adapt problem difficulty to the model, we introduce a closed-loop framework that learns modification strategies through prompt optimization in symbolic space. Experimental results demonstrate that both adaptive problem generation and symbolic representation modifications contribute to improving the model's math-solving ability.
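To make the idea of a symbolic problem space concrete, here is a minimal sketch. It is not the authors' implementation. It uses plain Python instead of SymPy or an SMT solver to stay dependency-free, and the names `SymbolicProblem`, `modify`, `answer_rule`, and the apple problem itself are hypothetical, chosen only for illustration. It shows a problem stored as named quantities plus a rule for the answer, a modification applied to those quantities, and the ground truth recomputed automatically while the wording stays in a separate template.

```python
from dataclasses import dataclass, replace
from fractions import Fraction
from typing import Callable, Dict


@dataclass(frozen=True)
class SymbolicProblem:
    """Hypothetical container: quantities, an answer rule, and a text template."""
    values: Dict[str, Fraction]                             # symbolic variables and their values
    answer_rule: Callable[[Dict[str, Fraction]], Fraction]  # constraint defining the answer
    template: str                                           # linguistic realization, kept apart from the math

    def ground_truth(self) -> Fraction:
        # The answer is derived from the symbolic form, never from the text.
        return self.answer_rule(self.values)

    def render(self) -> str:
        # Turn the symbolic form back into a word problem.
        return self.template.format(**self.values)


def modify(problem: SymbolicProblem, **new_values: int) -> SymbolicProblem:
    """Return a copy of the problem with some quantities changed."""
    merged = {**problem.values, **{k: Fraction(v) for k, v in new_values.items()}}
    return replace(problem, values=merged)


base = SymbolicProblem(
    values={"start": Fraction(5), "bags": Fraction(3), "per_bag": Fraction(4)},
    answer_rule=lambda v: v["start"] + v["bags"] * v["per_bag"],
    template=(
        "Ava has {start} apples and buys {bags} bags of {per_bag} apples each. "
        "How many apples does she have now?"
    ),
)

# A modification made in symbolic space: larger numbers, same structure.
harder = modify(base, bags=7, per_bag=12)

print(base.render(), "->", base.ground_truth())
print(harder.render(), "->", harder.ground_truth())
```

Because the answer comes from `answer_rule` rather than from the rendered text, every modified problem carries a verifiable label. A closed-loop system of the kind the abstract describes would choose which modifications to apply based on how the model performs; that selection step is not shown here.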

Teresa Yeo, Myeongho Jeon, Dulaj Weerakoon, Rui Qiao, Alok Prakash, Armando Solar-Lezama, Archan Misra • 2026

Related benchmarks

Task | Dataset | Result | Rank
Mathematical Reasoning | GSM8K (test) | Accuracy: 81.96 | 797
Mathematical Reasoning | MATH 500 | Accuracy: 61.17 | 155
Mathematical Reasoning | GSM-Symbolic | GSM-Sym Accuracy: 78.22 | 43
Mathematical Reasoning | GSM-PLUS | Acc (Original): 61.5 | 28
Mathematical Reasoning | GSM8K, GSM-Sym, Sym-p1/p2, MATH-500, and GSM-Plus Average (test) | Average Accuracy: 64.39 | 11
