
Problem Solving and Unsolvability Detection on Maze Easy

Accuracy (Solvable): 100

Deepseek-V3.2-R

Updated 4mo ago

Evaluation Results

Method	Date	Accuracy (Solvable)	Accuracy (Unsolvable)	Average
Deepseek-V3.2-R	2025.12	100	98.9	99.5
Gemini-3	2025.12	99	94.6	96.8
Qwen3-4B + UnsolvableRL	2025.12	96.5	98.9	97.7
GPT-5.1-Low	2025.12	95	100	97.5
Qwen3-4B Instruct	2025.12	32.7	55.6	44.1
Qwen3-1.7B Instruct	2025.12	0	85.6	42.8
Qwen3-1.7B + UnsolvableRL	2025.12	0	95.2	47.6
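
In every row above, the last score appears to equal, within rounding, the unweighted mean of the two scores before it. The page itself does not state this, so treat it as an observation about the numbers rather than a documented definition. The sketch below checks it; the names `ROWS`, `TOLERANCE`, `mean_gap`, and `check_rows` are hypothetical and not part of any benchmark tooling.

```python
# Rows copied from the table above: (method, first score, second score, last score).
ROWS = [
    ("Deepseek-V3.2-R", 100.0, 98.9, 99.5),
    ("Gemini-3", 99.0, 94.6, 96.8),
    ("Qwen3-4B + UnsolvableRL", 96.5, 98.9, 97.7),
    ("GPT-5.1-Low", 95.0, 100.0, 97.5),
    ("Qwen3-4B Instruct", 32.7, 55.6, 44.1),
    ("Qwen3-1.7B Instruct", 0.0, 85.6, 42.8),
    ("Qwen3-1.7B + UnsolvableRL", 0.0, 95.2, 47.6),
]

# Assumption: scores are reported to one decimal place, so allow slightly
# more than half a unit in the last digit as rounding slack.
TOLERANCE = 0.06


def mean_gap(first, second, last):
    """Absolute difference between the last score and the mean of the first two."""
    return abs((first + second) / 2 - last)


def check_rows(rows, tolerance=TOLERANCE):
    """Return the methods whose last score is not within tolerance of the mean."""
    return [name for name, a, b, c in rows if mean_gap(a, b, c) > tolerance]


if __name__ == "__main__":
    for name, a, b, c in ROWS:
        print(f"{name}: mean={(a + b) / 2:.2f} reported={c}")
    print("mismatches:", check_rows(ROWS))
```

A fixed tolerance is used instead of `round()` because values such as 99.45 and 44.15 sit exactly on a rounding boundary, where the page's rounding direction is not known.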