Alignment with Human Preferences on Chatbot Arena English-only
[Chart: Spearman Correlation by date. Top entry: Auto-Arena, 91.67, May 30, 2024]
Evaluation Results
Method         | Setting                    | Date    | Spearman Correlation
Auto-Arena     | Peer Battles=included,...  | 2024.05 | 91.67
Auto-Arena     | Committee Discussions=w/o  | 2024.05 | 88.33
Auto-Arena     | Peer Battles=w/o           | 2024.05 | 86.67
Arena-Hard     | -                          | 2024.05 | 85.71
MT-Bench       | -                          | 2024.05 | 82.86
LC-AlpacaEval  | -                          | 2024.05 | 82.14
MMLU           | -                          | 2024.05 | 56.36
GPQA           | -                          | 2024.05 | 36.84
OpenLLM        | -                          | 2024.05 | -15.39