Scientific Reasoning on PHYBench
[Chart: Accuracy on PHYBench. Top result: InftyThink+, 30.34 (Feb 6, 2026). Metric views: Accuracy, Token Count, Latency.]
Updated 4d ago
Evaluation Results
Method       Setting                      Links    Accuracy  Token Count  Latency
InftyThink+  RL Setting=Task reward...    2026.02  30.34     30.34        219.01
InftyThink+  RL Setting=Task and ef...    2026.02  29.62     14.63        97.57
Vanilla      RL Setting=Task reward...    2026.02  20.11     20.46        320.06
InftyThink+  RL Setting=Cold start...     2026.02  19.14     14.6         94.55
Vanilla      RL Setting=Cold start...     2026.02  16.25     13.44        149.38