Model Ranking Prediction on Helpsteer 13B+ Models Holdout (test)
Best result: 74.1 Acc_pair (RM1 Helpful), BENCHALIGN (Feb 2, 2026)
Metrics: Acc_pair (RM1 Helpful), Acc_pair (RM2 Helpful), rho (RM1 Helpful), rho (RM2 Helpful)
Updated 4d ago
Evaluation Results
Method          Setting              Date     Acc_pair (RM1 Helpful)  Acc_pair (RM2 Helpful)  rho (RM1 Helpful)  rho (RM2 Helpful)
BENCHALIGN      Holdout=13B+ Models  2026.02  74.1                    69.6                    0.674              0.552
METABENCH       Holdout=13B+ Models  2026.02  58.4                    53                      0.243              0.089
TINYBENCHMARKS  Holdout=13B+ Models  2026.02  57                      52                      0.203              0.061
RANDOM          Holdout=13B+ Models  2026.02  55.8                    53.7                    0.125              0.083
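
The page does not define its metrics. As a rough sketch only, assuming Acc_pair is pairwise ranking accuracy (the percentage of model pairs ordered the same way by predicted and actual scores) and rho is Spearman rank correlation, the two quantities could be computed as below. The function names and the example scores are hypothetical and are not taken from this page or from any of the listed methods.

```python
from itertools import combinations


def pairwise_accuracy(predicted, actual):
    """Percentage of model pairs whose predicted order matches the actual order.

    Assumption of this sketch: pairs tied in the actual scores are skipped,
    and at least one untied pair exists.
    """
    agree = 0
    total = 0
    for i, j in combinations(range(len(actual)), 2):
        true_diff = actual[i] - actual[j]
        if true_diff == 0:
            continue
        total += 1
        if (predicted[i] - predicted[j]) * true_diff > 0:
            agree += 1
    return 100.0 * agree / total


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = mean_rank
        pos = end + 1
    return ranks


def spearman_rho(predicted, actual):
    """Pearson correlation of the two rank vectors (assumes neither is constant)."""
    rx, ry = average_ranks(predicted), average_ranks(actual)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical scores for four held-out models (not data from this page).
predicted_scores = [0.9, 0.7, 0.8, 0.1]
actual_scores = [0.95, 0.6, 0.5, 0.2]
acc = pairwise_accuracy(predicted_scores, actual_scores)  # 5 of 6 pairs agree
rho = spearman_rho(predicted_scores, actual_scores)
```

Under these assumptions, a method that ranks models at chance would land near 50 on Acc_pair and near 0 on rho, which is roughly where the RANDOM row sits.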