Direct Preference Optimization on RLHFlow AlpacaEval 2.0
[Chart: LCWR and Win Rate over time. Top entry: Difficulty-Based Preference Data Selection, LCWR 19.85, Aug 6, 2025. Updated 16d ago.]
Evaluation Results

| Method | Selection Method | Date | LCWR | Win Rate |
|---|---|---|---|---|
| Difficulty-Based Preference Data Selection | Ours | 2025.08 | 19.85 | 19.44 |
| Full Set | Full Set | 2025.08 | 18.74 | 17.93 |
| Random | Random | 2025.08 | 18.57 | 18.13 |
| ZIP† | ZIP | 2025.08 | 18.34 | 18.06 |
| SDPO | SDPO | 2025.08 | 18.09 | 17.83 |
| DiverseEvol† | Diver... | 2025.08 | 17.52 | 16.73 |
| Tulu3-SFT | SFT B... | 2025.08 | 2.57 | 2.16 |
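
To make the differences between methods easier to read, the scores listed above can be compared against the Full Set baseline. The following is a minimal sketch, not part of the original leaderboard: the dictionary `RESULTS` and the helper `gaps_vs_baseline` are hypothetical names introduced for illustration, the † markers are dropped from the method names, and the only data used are the LCWR and Win Rate values reported on this page.

```python
# Scores copied from the results listed on this page: (LCWR, Win Rate).
# The dictionary layout and function name are illustrative assumptions.
RESULTS = {
    "Difficulty-Based Preference Data Selection": (19.85, 19.44),
    "Full Set": (18.74, 17.93),
    "Random": (18.57, 18.13),
    "ZIP": (18.34, 18.06),
    "SDPO": (18.09, 17.83),
    "DiverseEvol": (17.52, 16.73),
    "Tulu3-SFT": (2.57, 2.16),
}


def gaps_vs_baseline(results, baseline="Full Set"):
    """Return (LCWR gap, Win Rate gap) of every method relative to the baseline."""
    base_lcwr, base_wr = results[baseline]
    return {
        name: (round(lcwr - base_lcwr, 2), round(wr - base_wr, 2))
        for name, (lcwr, wr) in results.items()
        if name != baseline
    }


if __name__ == "__main__":
    # Print the signed gap in points for each non-baseline method.
    for name, (d_lcwr, d_wr) in gaps_vs_baseline(RESULTS).items():
        print(f"{name:45s} LCWR {d_lcwr:+.2f}  Win Rate {d_wr:+.2f}")
```

Under these numbers, the top entry is the only selection method that exceeds the Full Set baseline on LCWR; Random and ZIP fall slightly below it on LCWR while edging above it on raw Win Rate.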