Problem Solving and Unsolvability Detection on Maze Hard

Solvable Accuracy: 98

Deepseek-V3.2-R

Updated 5mo ago

Evaluation Results

Method (date)	Links	Solvable Accuracy	(unlabeled)	(unlabeled)
Deepseek-V3.2-R 2025.12		98	98	98
Gemini-3 2025.12		96	91	93.5
GPT-5.1-Low 2025.12		86	94	90
Qwen3-4B + UnsolvableRL 2025.12		65.8	89	77.9
Qwen3-4B Instruct 2025.12		13.3	64.5	38.9
Qwen3-1.7B Instruct 2025.12		0	80	40
Qwen3-1.7B + UnsolvableRL 2025.12		0	79	39.5