Helpfulness on HH-RLHF (test)
Top result: MAVIS, Reward 2.542
[Chart: Reward by date (Aug 19, 2025); metrics available: Reward, KL Divergence]
Updated 4d ago
Evaluation Results
Method  Model Size  Date     Reward  KL Divergence
MAVIS   13B         2025.08  2.542   24.11
PPO     13B         2025.08  2.452   24.19
MAVIS   7B          2025.08  2.311   18.64
PPO     7B          2025.08  2.104   17.81