Mathematical Reasoning on DEEPMATH 128 samples
[Chart: Top-1 Accuracy by date; leading method ePF at 35.93, Oct 7, 2025]
Updated 18d ago
Evaluation Results
| Method | Details | Date | Top-1 Accuracy |
|---|---|---|---|
| ePF | Selection=Argmax, Scor... | 2025.10 | 35.93 |
| PF | Selection=Argmax, Scor... | 2025.10 | 34.37 |
| Best-of-N | Selection=Argmax, Scor... | 2025.10 | 32.03 |
| Beam-Search | Selection=Argmax, Scor... | 2025.10 | 32.03 |
| Self-Consistency | Selection=MV, Base Mod... | 2025.10 | 30.46 |
| ePF | Selection=Argmax, Scor... | 2025.10 | 25 |
| Base Sampling | Base Model=Qwen2.5-7B-... | 2025.10 | 23.43 |
| PF | Selection=Argmax, Scor... | 2025.10 | 22.65 |
| Beam-Search | Selection=Argmax, Scor... | 2025.10 | 21.09 |
| Best-of-N | Selection=Argmax, Scor... | 2025.10 | 20.31 |
| Self-Consistency | Selection=MV, Base Mod... | 2025.10 | 13.28 |
| Base Sampling | Base Model=Qwen2.5-1.5... | 2025.10 | 10.15 |