Reward Modeling on Anthropic Helpful-Harmless (HHH)
[Figure: RewardBench Total over time. Plotted point: Full Set, 0.7108, Aug 6, 2025.]
Evaluation Results
Method                                       Selection Size   Date      RewardBench Total
Full Set                                     100%             2025.08   0.7108
Difficulty-Based Preference Data Selection   10%              2025.08   0.7052
Random                                       10%              2025.08   0.6798