Logic Reasoning on CommonsenseQA
[Chart: Pass@1 by date. Top entry: MIG, 69.8 Pass@1 (Feb 1, 2026). Selectable metrics: Pass@1, Pass@8, Delta Pass@1.]
Updated 4d ago
Evaluation Results
Method | Setting | Date | Pass@1 | Pass@8 | Delta Pass@1
MIG | Evaluation Mode=Out-of... | 2026.02 | 69.8 | 93.4 | 0.4
GRPO | Evaluation Mode=Out-of... | 2026.02 | 69.4 | 92.8 | -
Base Model | Evaluation Mode=Out-of... | 2026.02 | 69 | 92.8 | -