Value Modeling on DAPO-Math-17k Qwen3-4B-Instruct-2507 policy (Held-out)
[Chart: Intra AUC over time. Top result: V0, 0.689 (Feb 3, 2026). Metrics tracked: Intra AUC, Pairwise Acc, Calibration MSE. Updated 1mo ago.]
Evaluation Results
Method               Protocol          Date     Intra AUC  Pairwise Acc  Calibration MSE
V0                   Strict Genera...  2026.02  0.689      0.804         0.138
Vanilla Value Model  Strict Genera...  2026.02  0.512      0.304         0.474