Learning the Boundary of Solvability: Aligning LLMs to Detect Unsolvable Problems

About

Ensuring large language model (LLM) reliability requires distinguishing objective unsolvability (inherent contradictions) from subjective capability limitations (tasks exceeding model competence). Current LLMs often conflate these dimensions, leading to hallucinations in which they return confident answers to inherently unsolvable queries. To address this issue, we propose UnsolvableQA, a multi-domain dataset containing both solvable and unsolvable questions, together with an alignment framework, UnsolvableRL. First, we construct UnsolvableQA via a "Reverse Construction" method that systematically injects logical contradictions into otherwise valid reasoning chains. Second, we introduce UnsolvableRL, a reinforcement learning paradigm that balances objective unsolvability detection with calibrated confidence under capability limits. Empirically, our approach achieves robust unsolvability detection (>85% detection rate) and boosts solvable reasoning accuracy from 43.4% to 69.4% on Qwen3-4B-Instruct. Crucially, we identify a data-training interaction: strict alignment constraints induce Capability Collapse without unsolvable data, but act as a regularizer for rigor when such data are included, thereby improving overall robustness. Our code and data are available at https://github.com/sfasfaffa/unsolvableQA.
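To make "objective unsolvability" concrete, the sketch below brute-forces one of the benchmark domains listed on this page, Game24: an instance is objectively unsolvable when no combination of +, -, *, / over the given numbers reaches 24. This is an illustrative assumption about how solvability could be verified in that domain, not the authors' Reverse Construction pipeline or their evaluation code; the function name `game24_solvable` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def game24_solvable(numbers, target=24):
    """Return True if the numbers can be combined with + - * / to hit target.

    Hypothetical helper for illustration only; exact rational arithmetic
    (Fraction) avoids floating-point false negatives on division.
    """

    def search(values):
        # Base case: one value left, check whether it equals the target.
        if len(values) == 1:
            return values[0] == target
        # Pick any two values, replace them with one combined result, recurse.
        for i, j in combinations(range(len(values)), 2):
            a, b = values[i], values[j]
            rest = [v for k, v in enumerate(values) if k not in (i, j)]
            candidates = {a + b, a - b, b - a, a * b}
            if b != 0:
                candidates.add(a / b)
            if a != 0:
                candidates.add(b / a)
            if any(search(rest + [c]) for c in candidates):
                return True
        return False

    return search([Fraction(n) for n in numbers])


print(game24_solvable([4, 7, 8, 8]))  # solvable: (7 - 8 / 8) * 4
print(game24_solvable([1, 1, 1, 1]))  # unsolvable: no expression reaches 24
```

An exhaustive checker like this only settles the objective side of the distinction; whether a model can find the solution to a solvable instance is the separate, capability-dependent question the abstract describes.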

Dengyun Peng, Qiguang Chen, Bofei Liu, Jiannan Guan, Libo Qin, Zheng Yan, Jinhao Liu, Jianshu Zhang, Wanxiang Che • 2025

Related benchmarks

Task	Dataset	Metric	Result	
Problem Solving and Unsolvability Detection	Game24	Solvable Accuracy	94.5	7
Problem Solving and Unsolvability Detection	HamCycle	Solvable Accuracy	41.1	7
Problem Solving and Unsolvability Detection	HamPath	Solvable Accuracy	55	7
Problem Solving and Unsolvability Detection	Hitori	Accuracy (Solvable)	63.5	7
Problem Solving and Unsolvability Detection	Maze Easy	Accuracy (Solvable)	96.5	7
Problem Solving and Unsolvability Detection	AIME 24/25	Solvable Accuracy	69.6	7
Problem Solving and Unsolvability Detection	Overall	Solvable Accuracy	69.4	7
Zebra Logic Puzzle Solving	Zebra Logic Unsolvable	Accuracy	87	7
8-Puzzle Solving	8-Puzzle Solvable	Accuracy	26.4	7
8-Puzzle Solving	8-Puzzle Unsolvable	Accuracy	0.158	7

Showing 10 of 17 rows
