Model Ranking Prediction on Helpsteer 30B+ Models Holdout (test)

76.5 Pairwise Accuracy (RM1)

BENCHALIGN

54.24460.02265.871.578
Feb 2, 2026
Updated 4d ago

Evaluation Results

Method    Links
2026.02   76.5   63.7   0.71    0.387
2026.02   60.8   49.1   0.324   -0.013
2026.02   60.6   49.7   0.328   0.007
2026.02   55.1   45.2   0.108   -0.148