
QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation

About

Reinforcement learning (RL) has emerged as a central paradigm for training large language models (LLMs) on reasoning tasks. Yet recent studies question RL's ability to incentivize reasoning capacity beyond the base model. This raises a key challenge: how can RL be adapted to solve harder reasoning problems more effectively? To address this challenge, we propose a simple yet effective strategy, Question Augmentation: introduce partial solutions during training to reduce problem difficulty and provide more informative learning signals. When applied during RL training on math reasoning tasks, our method, QuestA, improves not only pass@1 but also pass@k, particularly on problems where standard RL struggles to make progress. This enables continual improvement over strong open-source models such as DeepScaleR and OpenMath Nemotron, further enhancing their reasoning capabilities. We achieve new state-of-the-art results on math benchmarks using 1.5B-parameter models: 72.50% (+10.73%) on AIME24, 62.29% (+12.79%) on AIME25, and 41.67% (+10.11%) on HMMT25. Code, data, and models are available at https://github.com/foreverlasting1202/QuestA.

Jiazheng Li, Hongzhou Lin, Hong Lu, Kaiyue Wen, Zaiwen Yang, Jiaxuan Gao, Yi Wu, Jingzhao Zhang• 2025

Related benchmarks

Task                     Dataset         Metric            Result    Rank
Mathematical Reasoning   AIME 2025       Accuracy          62.08     227
Mathematical Reasoning   AIME 24         Accuracy          74.26     154
Mathematical Reasoning   MATH 500        Accuracy (Acc)    94.05     149
Mathematical Reasoning   AMC 2023        Accuracy          93.44     124
Mathematical Reasoning   OlympiadBench   Accuracy          78.53     81
Mathematical Reasoning   OlympiadBench   Accuracy          0.7228    72
Mathematical Reasoning   BRUMO25         Accuracy          73.75     62
Mathematical Reasoning   AIME 25         Pass@1 Accuracy   64.99     56
Mathematical Reasoning   AMC 23          Accuracy          95.1      56
Mathematical Reasoning   Minerva Math    Accuracy          32.08     54

(Showing 10 of 21 rows)
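The benchmark rows report pass@1-style accuracy. For context, pass@k is commonly computed with the standard unbiased estimator over n sampled generations of which c are correct; the sketch below shows that estimator. This is the widely used formulation, not code from the QuestA repository.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator.

    Probability that at least one of k samples drawn without replacement
    from n generations is correct, given that c of the n are correct:
        1 - C(n - c, k) / C(n, k)
    """
    if n - c < k:
        # fewer than k incorrect samples exist, so some draw must be correct
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, with 2 generations of which 1 is correct, pass@1 is 0.5; with all generations correct, pass@k is 1.0 for any k ≤ n.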
