Mathematical Reasoning on HMMT (Solved Rate %)
[Chart: Solved Rate by date; highlighted entry: Reflexion, 0.6 (May 21, 2025)]
Updated 23d ago
Evaluation Results
| Method | Method details | Date | Solved Rate |
|---|---|---|---|
| Reflexion | Model=Qwen3 32B think,... | 2025.05 | 0.6 |
| ICRL | Model=Qwen3 32B think,... | 2025.05 | 0.6 |
| Self-Refine | Model=Qwen3 32B think,... | 2025.05 | 0.5666 |
| Base | Model=Qwen3 32B think,... | 2025.05 | 0.52 |
| ICRL | Model=Qwen3 32B, Conte... | 2025.05 | 0.3333 |
| Reflexion | Model=Qwen3 32B, Conte... | 2025.05 | 0.2333 |
| ICRL | Model=Llama 4 Maverick... | 2025.05 | 0.2 |
| Self-Refine | Model=Qwen3 32B, Conte... | 2025.05 | 0.1666 |
| Self-Refine | Model=Llama 4 Maverick... | 2025.05 | 0.1333 |
| Self-Refine | Model=Phi-4, Context W... | 2025.05 | 0.1333 |
| Reflexion | Model=Phi-4, Context W... | 2025.05 | 0.1333 |
| ICRL | Model=Phi-4, Context W... | 2025.05 | 0.1333 |
| Reflexion | Model=Llama 4 Maverick... | 2025.05 | 0.1 |
| Base | Model=Qwen3 32B, Conte... | 2025.05 | 0.0914 |
| Base | Model=Llama 4 Maverick... | 2025.05 | 0.085 |
| Base | Model=Phi-4, Context W... | 2025.05 | 0.0555 |