Human Preference Evaluation on Arena-Hard v0.1
Best Win Rate: 56.7
[Chart: Win Rate by date; data point shown: Nov 5, 2025]
Evaluation Results
Method             Setting                      Date      Win Rate   JS
(name not shown)   Backbone=Llama-3.2-3B-...    2025.11   56.7       -
RLOO               Backbone=Llama-3.2-3B-...    2025.11   55.3       -
Llama-3.2-3B-It    -                            2025.11   26.2       -