Mathematical Reasoning on AIME 24 (Score)
[Chart: Score over time. Highlighted entry: TD-MNPO, Score 3.33, Sep 27, 2025.]
Updated 11d ago
Evaluation Results
Method     Reward Model     Date      Score
TD-MNPO    -                2025.09   3.33
HT-MNPO    ArmoRM-Ll...     2025.09   3.33
SFT Model  -                2025.09   0
DPO        -                2025.09   0
SimPO      -                2025.09   0
SPPO       -                2025.09   0
INPO       -                2025.09   0
HT-MNPO    Skywork-R...     2025.09   0
HT-MNPO    Athene-RM-8B     2025.09   0