Preference Alignment on Koala (GPT-4o-mini & RM Evaluators)
[Chart: Win Rate (Reward Model) over time. Plotted point: Vanilla Baseline, 77.75, May 8, 2026. Metrics shown: Win Rate (Reward Model), Win Rate (GPT-4o-mini), Average Win Rate.]
Updated 23d ago
Evaluation Results
Method                       Details                    Date     Win Rate (Reward Model)  Win Rate (GPT-4o-mini)  Average Win Rate
Vanilla Baseline             Preference Dimensions=...  2026.05  77.75                    71.06                   72.45
p-soup & Direct Fine-tuning  Preference Dimensions=...  2026.05  70.63                    47.75                   57.31
Direct Prompting             Preference Dimensions=...  2026.05  52.75                    50.31                   50.47