Chatbot Evaluation on ArenaHard
[Figure: Win Rate over time. Labeled point: AttentionPO, 13.88 Win Rate, May 21, 2026.]
Updated 12d ago
Evaluation Results
Method               Details                    Date     Win Rate
AttentionPO          Reference Model=Mistra...  2026.05  13.88
DPO                  Reference Model=Mistra...  2026.05  12.45
Mistral-7B-Base-SFT  Reference Model=Mistra...  2026.05  2.26