
Aggregated Logical Reasoning on Overall Solvable

Accuracy: 67.3

Deepseek-V3.2-R

Updated 5mo ago

Evaluation Results

Method	Links	Accuracy
Deepseek-V3.2-R 2025.12		67.3
Gemini-3.0-Pro 2025.12		54.3
GPT-5.1-Low 2025.12		38.5
Qwen3-4B-Instruct + UnsolRL-Final 2025.12		17.4
Qwen3-4B-Instruct 2025.12		11.8
Qwen3-1.7B-Instruct 2025.12		4.9
Qwen3-1.7B-Instruct + UnsolRL-Final 2025.12		3.3