Scientific Reasoning on GPQA Diamond (pass@1)
[Chart: best pass@1 over time. Latest point: SPLA, 69.5 pass@1, Jan 29, 2026.]
Updated 4d ago
Evaluation Results
Method            Settings                     Links      pass@1
SPLA              Model Size=14B, Temper...    2026.01    69.5
SPA               Model Size=14B, Temper...    2026.01    69.2
InfLLM-v2         Model Size=14B, Temper...    2026.01    68.7
Dense Attention   Model Size=14B, Temper...    2026.01    68.5
NSA               Model Size=14B, Temper...    2026.01    59.6