Next-token reasoning on OMNI-MATH Hard (val)
[Chart: Accuracy by date. Top entry: LoopRPT, 38.1 Accuracy, Mar 20, 2026. Available metrics: Accuracy, Average Step Count.]
Updated 26d ago
Evaluation Results
| Method   | Configuration              | Date    | Accuracy | Average Step Count |
|----------|----------------------------|---------|----------|--------------------|
| LoopRPT  | Model=Ouro-2.6B, Reaso...  | 2026.03 | 38.1     | 4                  |
| LoopRPT  | Model=Ouro-2.6B, Reaso...  | 2026.03 | 37.24    | 2.28               |
| LoopRPT  | Model=Ouro-1.4B, Reaso...  | 2026.03 | 34.82    | 3.07               |
| LoopRPT  | Model=Ouro-1.4B, Reaso...  | 2026.03 | 34.74    | 4                  |
| Peak     | Model=Ouro-2.6B            | 2026.03 | 34.52    | 4                  |
| Adap.    | Model=Ouro-2.6B            | 2026.03 | 34.35    | 3.51               |
| Adap.    | Model=Ouro-1.4B            | 2026.03 | 33.91    | 3.75               |
| Peak     | Model=Ouro-1.4B            | 2026.03 | 33.79    | 4                  |
| Vanilla  | Model=Qwen3-1.7B           | 2026.03 | 19.19    | -                  |
| +CoT     | Model=Qwen3-1.7B           | 2026.03 | 7.44     | -                  |