Mathematical Reasoning on MATH500 (128-sample random subset)
Best Top-1 Accuracy: 71.09 (ePF), Oct 7, 2025. Updated 18d ago.
Evaluation Results
Method            Configuration (truncated)   Date      Top-1 Accuracy
ePF               Selection=Argmax, Scor...   2025.10   71.09
PF                Selection=Argmax, Scor...   2025.10   70.31
Best-of-N         Selection=Argmax, Scor...   2025.10   67.96
ePF               Selection=Argmax, Scor...   2025.10   66.42
Beam-Search       Selection=Argmax, Scor...   2025.10   66.4
Self-Consistency  Selection=MV, Base Mod...   2025.10   65.62
Beam-Search       Selection=Argmax, Scor...   2025.10   62.5
Base Sampling     Base Model=Qwen2.5-7B-...   2025.10   60.93
PF                Selection=Argmax, Scor...   2025.10   60.15
Best-of-N         Selection=Argmax, Scor...   2025.10   57.81
Self-Consistency  Selection=MV, Base Mod...   2025.10   53.9
Base Sampling     Base Model=Qwen2.5-1.5...   2025.10   45.31
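The results list two selection rules, Selection=Argmax and Selection=MV. The sketch below illustrates one plausible reading of them, assuming that Argmax picks the candidate answer with the highest score and that MV stands for majority voting over the sampled answers. The function names, the score values, and the accuracy helper are hypothetical and are not taken from the benchmark's own code.

```python
from collections import Counter


def select_argmax(answers, scores):
    """Return the answer of the highest-scoring candidate (assumed meaning of Selection=Argmax)."""
    best_index = max(range(len(answers)), key=lambda i: scores[i])
    return answers[best_index]


def select_majority_vote(answers):
    """Return the most frequent answer (assumed meaning of Selection=MV).

    Ties go to the answer that appeared first in the list.
    """
    return Counter(answers).most_common(1)[0][0]


def top1_accuracy(predictions, references):
    """Percentage of problems whose single selected answer matches the reference."""
    correct = sum(p == r for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)


# Hypothetical candidates for one problem: three sampled answers with scores.
answers = ["4", "5", "5"]
scores = [0.9, 0.2, 0.3]

argmax_choice = select_argmax(answers, scores)    # picks the top-scored candidate
majority_choice = select_majority_vote(answers)   # picks the most common answer
```

The two rules can disagree on the same candidates, as they do here: one high-scoring answer can win under argmax while a different answer wins the vote.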