Step-level quality assessment on PRMBench
[Chart: Simplicity, Soundness, Sensitivity, and Total Score by method. Highlighted point: CPMI_Merge, Simplicity 54.65, Apr 12, 2026.]
Updated 5d ago
Evaluation Results
| Method | Reward Type | Date | Simplicity | Soundness | Sensitivity | Total Score |
|---|---|---|---|---|---|---|
| CPMI_Merge | CPMI_Merge | 2026.04 | 54.65 | 62.3 | 56.5 | 60.67 |
| CPMI | CPMI | 2026.04 | 52.56 | 60.21 | 55.59 | 58.75 |
| PAV | PAV | 2026.04 | 47.16 | 49.56 | 47.2 | 49.64 |
| MC | MC | 2026.04 | 40.88 | 36.92 | 35.06 | 38.8 |
| Rand_Merge | Rand_Merge | 2026.04 | 27.65 | 21.64 | 18.65 | 23.2 |