Learning the Boundary of Solvability: Aligning LLMs to Detect Unsolvable Problems
About
Ensuring large language model (LLM) reliability requires distinguishing objective unsolvability (inherent contradictions in the problem) from subjective capability limitations (tasks that exceed the model's competence). Current LLMs often conflate the two, which leads to hallucinations in which they return confident answers to inherently unsolvable queries. To address this issue, we introduce UnsolvableQA, a multi-domain dataset containing both solvable and unsolvable questions, together with an alignment framework, UnsolvableRL. First, we construct UnsolvableQA using a "Reverse Construction" method that systematically injects logical contradictions into otherwise valid reasoning chains. Second, we introduce UnsolvableRL, a reinforcement learning paradigm that balances objective unsolvability detection with calibrated confidence under capability limits. Empirically, our approach achieves robust unsolvability detection (>85% detection rate) and raises solvable reasoning accuracy from 43.4% to 69.4% on Qwen3-4B-Instruct. Crucially, we identify a data-training interaction: without unsolvable data, strict alignment constraints induce Capability Collapse, but when such data are included the same constraints act as a regularizer for rigor and improve overall robustness. Our code and data are available at https://github.com/sfasfaffa/unsolvableQA.
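To make the notion of objective unsolvability concrete, here is a minimal sketch for Game24, one of the domains listed in the benchmarks. It is an illustration only, not the paper's construction or verification pipeline: the function names `can_reach` and `_search` are hypothetical, and the rules assumed are the usual ones (use every number exactly once with `+`, `-`, `*`, `/`). An instance is objectively unsolvable when an exhaustive search finds no expression equal to the target.

```python
from fractions import Fraction
from itertools import combinations


def can_reach(nums, target=24):
    """Return True if nums can be combined with + - * / to equal target.

    Hypothetical helper for illustration; assumes each number is used once.
    """
    values = [Fraction(n) for n in nums]
    return _search(values, Fraction(target))


def _search(values, target):
    # Base case: one value left, compare it exactly (Fractions avoid float error).
    if len(values) == 1:
        return values[0] == target
    # Pick any two values, replace them with the result of one operation, recurse.
    for i, j in combinations(range(len(values)), 2):
        a, b = values[i], values[j]
        rest = [v for k, v in enumerate(values) if k not in (i, j)]
        candidates = {a + b, a - b, b - a, a * b}
        if b != 0:
            candidates.add(a / b)
        if a != 0:
            candidates.add(b / a)
        for c in candidates:
            if _search(rest + [c], target):
                return True
    return False


if __name__ == "__main__":
    print(can_reach([4, 7, 8, 8]))  # solvable: (7 - 8 / 8) * 4
    print(can_reach([1, 1, 1, 1]))  # no expression reaches 24
```

A check like this separates the two failure modes described above: if the search returns `False`, the correct response is to report unsolvability, whereas a `True` instance that a model fails on reflects a capability limit.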
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Problem Solving and Unsolvability Detection | Game24 | Solvable Accuracy | 94.5 | 7 |
| Problem Solving and Unsolvability Detection | HamCycle | Solvable Accuracy | 41.1 | 7 |
| Problem Solving and Unsolvability Detection | HamPath | Solvable Accuracy | 55 | 7 |
| Problem Solving and Unsolvability Detection | Hitori | Accuracy (Solvable) | 63.5 | 7 |
| Problem Solving and Unsolvability Detection | Maze Easy | Accuracy (Solvable) | 96.5 | 7 |
| Problem Solving and Unsolvability Detection | AIME 24/25 | Solvable Accuracy | 69.6 | 7 |
| Problem Solving and Unsolvability Detection | Overall | Solvable Accuracy | 69.4 | 7 |
| Zebra Logic Puzzle Solving | Zebra Logic Unsolvable | Accuracy | 87 | 7 |
| 8-Puzzle Solving | 8-Puzzle Solvable | Accuracy | 26.4 | 7 |
| 8-Puzzle Solving | 8-Puzzle Unsolvable | Accuracy | 0.158 | 7 |
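
For the 8-Puzzle rows, unsolvability has a well-known closed-form test: on a 3x3 board, a configuration can reach the standard goal (tiles 1 to 8 in order, blank last) only if the number of inversions among the tiles is even. The sketch below illustrates that parity check. It assumes the standard goal state and a flat row-major encoding with `0` for the blank; the function name `is_solvable_8puzzle` is hypothetical, and whether the benchmark defines unsolvable instances this way is not stated here.

```python
def is_solvable_8puzzle(board):
    """Parity test for a 3x3 sliding puzzle given as 9 ints, 0 = blank.

    Hypothetical helper; assumes the goal is 1..8 in row-major order
    followed by the blank.
    """
    tiles = [t for t in board if t != 0]  # the blank does not count
    inversions = sum(
        1
        for i in range(len(tiles))
        for j in range(i + 1, len(tiles))
        if tiles[i] > tiles[j]
    )
    # On an odd-width board every legal move preserves inversion parity,
    # and the goal has zero inversions, so only even counts are solvable.
    return inversions % 2 == 0


if __name__ == "__main__":
    print(is_solvable_8puzzle([1, 2, 3, 4, 5, 6, 7, 8, 0]))  # goal state
    print(is_solvable_8puzzle([1, 2, 3, 4, 5, 6, 8, 7, 0]))  # tiles 7 and 8 swapped
```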