Reward Modeling on Unified-Feedback ID (test)
[Chart: Reward Score over time. Top result: GRM w/ sft, Reward Score 71.5 (Jun 14, 2024).]
Evaluation Results
| Method | Configuration | Date | Reward Score |
|---|---|---|---|
| GRM w/ sft | Base Model=gemma-2B-it... | 2024.06 | 71.5 |
| GRM w/ dpo-noref | Base Model=gemma-2B-it... | 2024.06 | 71.4 |
| GRM w/ dpo | Base Model=gemma-2B-it... | 2024.06 | 70.2 |
| Classifier + Ensemble | Base Model=gemma-2B-it... | 2024.06 | 69.9 |
| Classifier + margin | Base Model=gemma-2B-it... | 2024.06 | 69.6 |
| Classifier (baseline) | Base Model=gemma-2B-it... | 2024.06 | 68.8 |
| Classifier + label smooth | Base Model=gemma-2B-it... | 2024.06 | 68.5 |
| Classifier (Frozen) | Base Model=gemma-2B-it... | 2024.06 | 63.9 |