Science Reasoning on TAL-SCQ5K EN
Best result: MIG, 73 Pass@1 (Feb 1, 2026)
[Chart: Pass@1 by method; alternative metrics: Pass@8, Delta Pass@1]
Evaluation Results

Method      Evaluation Mode  Date     Pass@1  Pass@8  Delta Pass@1
MIG         In-Dom...        2026.02  73      90      11
Base Model  In-Dom...        2026.02  64      86      -
GRPO        In-Dom...        2026.02  62      81      -