Math Reasoning on HMMT Feb 25 (pass@8, mean@8)
[Chart: Mean@8 and Pass@8 by method. Highlighted point: MOPD, Mean@8 18.5, May 12, 2026.]
Updated 21d ago
Evaluation Results

Method     Details                Date      Mean@8   Pass@8
MOPD       Base Model=Qwen3-4B    2026.05   18.5     19.58
GRPO       Base Model=Qwen3-4B    2026.05   16.92    20.94
SDPO       Base Model=Qwen3-4B    2026.05   8.59     13.61
Qwen3-4B   status=base model      2026.05   5.06     10
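The page does not define its two metrics, so the following is only a sketch under the common convention: Mean@8 is taken as the average correctness over 8 sampled answers per problem, and Pass@8 as the share of problems where at least one of the 8 samples is correct, both as percentages. The function names and the toy data are hypothetical and are not taken from the leaderboard.

```python
from typing import Sequence


def mean_at_k(results: Sequence[Sequence[bool]]) -> float:
    """Average per-sample accuracy across problems, as a percentage.

    results[i][j] is True if sample j for problem i was correct.
    Assumed definition; the leaderboard does not spell it out.
    """
    per_problem = [sum(r) / len(r) for r in results]
    return 100.0 * sum(per_problem) / len(per_problem)


def pass_at_k(results: Sequence[Sequence[bool]]) -> float:
    """Share of problems with at least one correct sample, as a percentage."""
    solved = [any(r) for r in results]
    return 100.0 * sum(solved) / len(solved)


# Toy data (made up): 4 problems, 8 samples each.
toy = [
    [True] * 8,            # always solved
    [True] + [False] * 7,  # solved once in 8 tries
    [False] * 8,           # never solved
    [False] * 8,           # never solved
]

print(mean_at_k(toy))
print(pass_at_k(toy))
```

Under these assumed definitions Pass@8 can never fall below Mean@8 for the same set of samples, which matches every row in the table above.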