Knowledge Reasoning on GPQA Diamond
[Chart: Accuracy (avg@8) over time. Top result: 47.1, DeepSeek-R1-Distill-Qwen-7B, Dec 18, 2025.]
Evaluation Results
Method                         Configuration               Date      Accuracy (avg@8)
DeepSeek-R1-Distill-Qwen-7B    Architecture=Dense, #...    2025.12   47.1
Sigma-MoE-Tiny                 Architecture=MoE, # Ac...   2025.12   46.4
DeepSeek-R1-Distill-Llama-8B   Architecture=Dense, #...    2025.12   43.2
Qwen3-1.7B                     Architecture=Dense, #...    2025.12   40.1
Phi-3.5-MoE                    Architecture=MoE, # Ac...   2025.12   36.8
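
The page does not define "avg@8". A common reading, assumed here, is that each question is answered with 8 independently sampled generations and the reported score is the mean accuracy across those samples, averaged over all questions. The sketch below illustrates that reading only; the function name `avg_at_k` and the toy data are hypothetical and are not taken from the leaderboard.

```python
from statistics import mean


def avg_at_k(correct_by_question: list[list[bool]]) -> float:
    """Return mean accuracy over k samples per question, as a percentage.

    Assumption: every inner list holds the correctness flags for the
    k sampled answers to one question (k = 8 for avg@8).
    """
    if not correct_by_question:
        raise ValueError("need at least one question")
    k = len(correct_by_question[0])
    if k == 0 or any(len(samples) != k for samples in correct_by_question):
        raise ValueError("every question needs the same non-zero number of samples")
    # Per-question accuracy first, then the average across questions.
    per_question = [
        mean(1.0 if ok else 0.0 for ok in samples)
        for samples in correct_by_question
    ]
    return 100.0 * mean(per_question)


# Toy data (hypothetical): two questions, 8 sampled answers each.
toy = [
    [True, True, True, True, True, True, False, False],      # 6 of 8 correct
    [True, True, False, False, False, False, False, False],  # 2 of 8 correct
]
score = avg_at_k(toy)  # (0.75 + 0.25) / 2 = 0.5, i.e. 50.0
```

Averaging over several samples reduces the run-to-run variance that a single sampled answer per question would introduce, which matters on a small test set.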