Preference Modeling on Arena-Hard V2
Top Win Rate: 73.2 (Nanbeige4.1-3B)
[Figure: Win Rate over time, Nov 5, 2025 to Feb 13, 2026]
Updated 1mo ago
Evaluation Results
Method              Details                    Date     Win Rate
Nanbeige4.1-3B      Parameters=3B              2026.02  73.2
Qwen3-Next-80B-A3B  Parameters=80B             2026.02  62.3
Qwen3-30B-A3B-2507  Parameters=30B             2026.02  60.2
Nanbeige4-3B-2511   Parameters=3B              2026.02  60
Qwen3-32B           Parameters=32B             2026.02  56
Qwen3-4B-2507       Parameters=4B              2026.02  34.9
JS                  Backbone=Llama-3.2-3B-...  2025.11  7.4
RLOO                Backbone=Llama-3.2-3B-...  2025.11  5.6
Llama-3.2-3B-It     -                          2025.11  2.4