LLM Alignment on AlpacaEval Length-Controlled (test)
Chart: LC Win Rate over time. Top entry: UNA-score (MSE), 8.78 LC Win Rate (Aug 27, 2024).
Updated 23d ago
Evaluation Results
Method            Configuration              Date     LC Win Rate
UNA-score (MSE)   Base Model=Mistral, Fi...  2024.08  8.78
UNA-score (MSE)   Base Model=Llama, Fine...  2024.08  7.87
UNA-binary (BCE)  Base Model=Mistral, Fi...  2024.08  7.41
KTO               Base Model=Mistral, Fi...  2024.08  4.46
KTO               Base Model=Llama, Fine...  2024.08  4.17
UNA-binary (BCE)  Base Model=Llama, Fine...  2024.08  3.96
DPO               Base Model=Mistral, Fi...  2024.08  3.67
UNA-pairwise      Base Model=Mistral, Fi...  2024.08  3.67
DPO               Base Model=Llama, Fine...  2024.08  2.09
UNA-pairwise      Base Model=Llama, Fine...  2024.08  2.09
Mistral           Base Model=Mistral, Fi...  2024.08  0.31
Llama             Base Model=Llama, Fine...  2024.08  0.25