Reward Modeling on RewardBench OOD Evaluation

Chat: 99.4

FsfairX-Llama3-RM-v0.1

[Chart: score over time; y-axis 95.968 to 98.641; data point dated May 17, 2025]
Updated 1mo ago

Evaluation Results

Method   Links
2025.05   99.4   65.1   87.8   86.4   84.7
2025.05   98.3   63.9   85.1   95.8   85.8
2025.05   98.2   66.3   87.8   95.7   87
2025.05   96.1   76.1   88.1   86.6   86.7