Scientific Reasoning on MMLU-Redux (ACC, TOK, LAT)
[Chart: Accuracy over time. Highest plotted point: 54.64 (InftyThink+), Feb 6, 2026. Available metrics: Accuracy, Token Throughput, Latency.]
Updated 4d ago
Evaluation Results
| Method | RL Setting | Links | Accuracy | Token Throughput | Latency |
|---|---|---|---|---|---|
| InftyThink+ | Task reward... | 2026.02 | 54.64 | 4.81 | 30.42 |
| InftyThink+ | Task and ef... | 2026.02 | 53.94 | 2.29 | 12.55 |
| InftyThink+ | Cold start... | 2026.02 | 51.34 | 3.12 | 17.61 |
| Vanilla | Task reward... | 2026.02 | 51.1 | 5.85 | 69.27 |
| Vanilla | Cold start... | 2026.02 | 50.37 | 3.45 | 28.54 |
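The rows above can be compared pairwise. The following is a minimal Python sketch, not part of the benchmark itself, that matches each InftyThink+ row with the Vanilla row carrying the same RL setting label and reports the accuracy difference and the latency ratio. The names `ROWS` and `compare` are hypothetical, the setting labels are truncated in the source and are matched only as displayed, and the page does not state the units of the throughput and latency columns, so the ratio is a plain quotient of the listed values.

```python
# Rows transcribed from the results table:
# (method, rl_setting, accuracy, token_throughput, latency).
# RL setting labels are truncated in the source and kept as displayed.
ROWS = [
    ("InftyThink+", "Task reward...", 54.64, 4.81, 30.42),
    ("InftyThink+", "Task and ef...", 53.94, 2.29, 12.55),
    ("InftyThink+", "Cold start...", 51.34, 3.12, 17.61),
    ("Vanilla", "Task reward...", 51.10, 5.85, 69.27),
    ("Vanilla", "Cold start...", 50.37, 3.45, 28.54),
]


def compare(rows, baseline="Vanilla"):
    """Pair each non-baseline row with the baseline row sharing its setting label.

    Returns tuples of (method, setting, accuracy_difference, latency_ratio).
    Rows whose setting has no baseline counterpart are skipped.
    """
    base = {s: (acc, lat) for m, s, acc, _, lat in rows if m == baseline}
    out = []
    for method, setting, acc, _, lat in rows:
        if method == baseline or setting not in base:
            continue
        base_acc, base_lat = base[setting]
        out.append((method, setting, round(acc - base_acc, 2), round(lat / base_lat, 3)))
    return out


comparisons = compare(ROWS)
for method, setting, acc_diff, lat_ratio in comparisons:
    print(f"{method} [{setting}]: accuracy {acc_diff:+.2f}, latency x{lat_ratio:.3f} vs Vanilla")
```

A latency ratio below 1 means the listed latency value is smaller than the Vanilla value for the same setting label. The "Task and ef..." row has no Vanilla counterpart in the table, so it is left out of the comparison.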