PRM800K

Benchmarks

Task Name                  Dataset Name  SOTA Result     Trend
Math Reasoning             PRM800K       AUC-ROC 0.613   5
Instance-level Evaluation  PRM800K       AUC-ROC 0.42    1