Aggregated Logical Reasoning on Overall Mean
[Chart: Accuracy over time. Best result: 76.2 (Deepseek-V3.2-R), Dec 1, 2025]
Updated 4d ago
Evaluation Results
| Method | Details | Date | Accuracy |
|---|---|---|---|
| Deepseek-V3.2-R | Model=Deepseek-V3.2-R | 2025.12 | 76.2 |
| GPT-5.1-Low | Model=GPT-5.1-Low | 2025.12 | 71.5 |
| Gemini-3.0-Pro | Model=Gemini-3.0-Pro | 2025.12 | 69.9 |
| Qwen3-4B-Instruct + UnsolRL-Final | Base Model=Qwen3-4B-In... | 2025.12 | 34.9 |
| Qwen3-4B-Instruct | Model=Qwen3-4B-Instruct | 2025.12 | 23.2 |
| Qwen3-1.7B-Instruct + UnsolRL-Final | Base Model=Qwen3-1.7B-... | 2025.12 | 13.8 |
| Qwen3-1.7B-Instruct | Model=Qwen3-1.7B-Instruct | 2025.12 | 11.5 |