General Chat on Arena Hard v0.1
[Chart: Win Rate by date. Leading method: PMLE, 46.5 (Jun 2, 2025).]
Updated 14d ago
Evaluation Results
Method                      Details                    Date     Win Rate
PMLE                        Backbone=LLaMA-3-8B-In...  2025.06  46.5
DPO                         Backbone=LLaMA-3-8B-In...  2025.06  45.1
Pref. Distill.              Backbone=LLaMA-3-8B-In...  2025.06  41.9
REBEL                       Backbone=LLaMA-3-8B-In...  2025.06  39.8
Base (LLaMA-3-8B-Instruct)  Model Type=Instruct-on...  2025.06  26.8