General Chat on Arena-Hard Vanilla
Best Win Rate: 0.492 (PROSPER), Feb 22, 2026
Evaluation Results
| Method | Details | Date | Win Rate |
| PROSPER | Backbone=Qwen2.5-7B-In... | 2026.02 | 0.492 |
| PROSPER-VB | Backbone=Qwen2.5-7B-In... | 2026.02 | 0.476 |
| PROSPER-JC | Backbone=Qwen2.5-7B-In... | 2026.02 | 0.442 |
| RLCF | Backbone=Qwen2.5-7B-In... | 2026.02 | 0.426 |
| Qwen2.5-7B-Instruct | Model Scale=7B, LLM Ju... | 2026.02 | 0.424 |