Human Preference Alignment on ArenaHard V2
[Chart: Avg@3 Score by date. Data point shown: Qwen3-30A3-2507, Avg@3 Score 60, Dec 6, 2025.]
Updated 4d ago
Evaluation Results
Method                  Release Identifier   Date      Avg@3 Score
Qwen3-30A3-2507         2507                 2025.12   60
Nanbeige4-3B-Thinking   2511                 2025.12   60
Qwen3-32B-2504          2504                 2025.12   48.4
Qwen3-4B-2507           2507                 2025.12   40.5
Qwen3-14B-2504          2504                 2025.12   39.9
Qwen3-8B-2504           2504                 2025.12   26.4