Preference Aggregation on Preference Evaluation Suite Aggregate
Best Average Preference Win Rate: 36.68 (AlphaToken, Jun 1, 2026). Updated 1d ago.
Evaluation Results
| Method | Configuration | Date | Average Preference Win Rate |
| --- | --- | --- | --- |
| AlphaToken | Backbone=Qwen-3.5-9B,... | 2026.06 | 36.68 |
| SePO | Backbone=Qwen-3.5-9B,... | 2026.06 | 34.03 |
| ConfPO | Backbone=Qwen-3.5-9B,... | 2026.06 | 33.82 |
| TI-DPO | Backbone=Qwen-3.5-9B,... | 2026.06 | 32.33 |
| DPO | Backbone=Qwen-3.5-9B,... | 2026.06 | 32.11 |
| AlphaToken | Backbone=Gemma-3-4B, W... | 2026.06 | 29.63 |
| ConfPO | Backbone=Gemma-3-4B, W... | 2026.06 | 26.68 |
| SePO | Backbone=Gemma-3-4B, W... | 2026.06 | 26.62 |
| TI-DPO | Backbone=Gemma-3-4B, W... | 2026.06 | 24.99 |
| DPO | Backbone=Gemma-3-4B, W... | 2026.06 | 20.94 |
| AlphaToken | Backbone=Llama-3.2-3B,... | 2026.06 | 17.62 |
| SePO | Backbone=Llama-3.2-3B,... | 2026.06 | 15.93 |
| ConfPO | Backbone=Llama-3.2-3B,... | 2026.06 | 15.07 |
| Base | Backbone=Qwen-3.5-9B,... | 2026.06 | 14.12 |
| TI-DPO | Backbone=Llama-3.2-3B,... | 2026.06 | 13.9 |
| DPO | Backbone=Llama-3.2-3B,... | 2026.06 | 11.13 |
| Base | Backbone=Gemma-3-4B, W... | 2026.06 | 8.59 |
| Base | Backbone=Llama-3.2-3B,... | 2026.06 | 3.1 |