Math reasoning on HMMT25 Nov.
[Chart: Mean@8 and Pass@8 on HMMT25 Nov. Highlighted point: MOPD, Mean@8 = 15.83, May 12, 2026.]
Updated 20d ago
Evaluation Results
| Method | Details | Date | Mean@8 | Pass@8 |
|---|---|---|---|---|
| MOPD | Base Model=Qwen3-4B | 2026.05 | 15.83 | 22.47 |
| GRPO | Base Model=Qwen3-4B | 2026.05 | 13.16 | 19.53 |
| SDPO | Base Model=Qwen3-4B | 2026.05 | 12.01 | 18.49 |
| Qwen3-4B | status=base model | 2026.05 | 9.58 | 13.91 |
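
The page does not define Mean@8 or Pass@8. As a rough sketch, assuming the common reading (Mean@k is the accuracy averaged over k sampled answers per problem, and Pass@k is the share of problems where at least one of the k samples is correct), the two metrics could be computed as below. The function names and the toy data are hypothetical and are not taken from the benchmark's own evaluation code.

```python
from typing import Sequence


def mean_at_k(results: Sequence[Sequence[bool]]) -> float:
    """Average per-sample accuracy over all problems, as a percentage.

    results[i][j] is True if sample j for problem i was correct.
    Assumed definition; the benchmark may compute it differently.
    """
    per_problem = [sum(samples) / len(samples) for samples in results]
    return 100.0 * sum(per_problem) / len(per_problem)


def pass_at_k(results: Sequence[Sequence[bool]]) -> float:
    """Share of problems with at least one correct sample, as a percentage."""
    solved = [any(samples) for samples in results]
    return 100.0 * sum(solved) / len(solved)


# Toy data (hypothetical): 4 problems, 8 samples each.
toy = [
    [True] * 8,                 # always solved
    [True] + [False] * 7,       # solved once in 8 tries
    [False] * 8,                # never solved
    [True, True] + [False] * 6, # solved twice in 8 tries
]

print(mean_at_k(toy))  # 34.375
print(pass_at_k(toy))  # 75.0
```

Under this reading Pass@8 can never be lower than Mean@8, which matches every row in the results above.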