Reward Modeling on Anthropic Helpful-Harmless (HHH)

RewardBench Total: 0.7108

Full Set

[Chart: y-axis ticks 0.67856, 0.68693, 0.6953, 0.70367; x-axis date Aug 6, 2025]
Updated 16d ago

Evaluation Results

Method  Links
2025.08
0.7108
0.7052
2025.08
0.6798