Preference Alignment on HelpSteer3
Best score: -5.89 (DPO+Filter, Oct 10, 2025)
Metrics: Score, Winrate
Updated 21d ago
Evaluation Results
Method     | Details                   | Date    | Score  | Winrate
DPO+Filter | Base Architecture=Qwen... | 2025.10 | -5.89  | 73
DPO        | Base Architecture=Qwen... | 2025.10 | -6.56  | 67
DPO+Filter | Base Architecture=Qwen... | 2025.10 | -6.83  | 68
DPO        | Base Architecture=Qwen... | 2025.10 | -6.91  | 67
Base       | Base Architecture=Qwen... | 2025.10 | -8.39  | 50
DPO+Filter | Base Architecture=Llam... | 2025.10 | -9.06  | 66
DPO        | Base Architecture=Llam... | 2025.10 | -9.27  | 58
DPO+Filter | Base Architecture=Llam... | 2025.10 | -9.44  | 60
DPO        | Base Architecture=Llam... | 2025.10 | -9.74  | 56
Base       | Base Architecture=Llam... | 2025.10 | -10.43 | 50