Instruction Following on WildBench
[Chart: WB Score over time. Best result: CoDIT-Qwen3-8B, WB Score 63.18 (Apr 15, 2026).]
Evaluation Results
| Method | Method details | Date | WB Score |
|---|---|---|---|
| CoDIT-Qwen3-8B | Student Model=Qwen3-8B... | 2026.04 | 63.18 |
| CoDIT-Gemma3 | Student Model=Qwen3-8B... | 2026.04 | 62.21 |
| CoDIT-Qwen3-30B | Student Model=Qwen3-8B... | 2026.04 | 60.23 |
| WebR-Pro | Student Model=Qwen3-8B... | 2026.04 | 54.4 |
| CoDIT-Gemma3 | Student Model=Llama-3.... | 2026.04 | 54.12 |
| CoDIT-Qwen3-8B | Student Model=Llama-3.... | 2026.04 | 48.34 |
| CoDIT-Qwen3-30B | Student Model=Llama-3.... | 2026.04 | 48.28 |
| Gemma-2-LMSYS-Chat-1M-Synth | Student Model=Qwen3-8B... | 2026.04 | 46.84 |
| Magpie-Pro-300K-Filtered | Student Model=Qwen3-8B... | 2026.04 | 42.91 |
| Llama-3.1-LMSYS-Chat-1M-Synth | Student Model=Qwen3-8B... | 2026.04 | 42.05 |
| WebR-Basic | Student Model=Qwen3-8B... | 2026.04 | 41.2 |
| WebR-Pro | Student Model=Llama-3.... | 2026.04 | 37.19 |
| Gemma-2-LMSYS-Chat-1M-Synth | Student Model=Llama-3.... | 2026.04 | 36.2 |
| Llama-3.1-LMSYS-Chat-1M-Synth | Student Model=Llama-3.... | 2026.04 | 30.94 |
| WildChat | Student Model=Qwen3-8B... | 2026.04 | 29.99 |
| Magpie-Pro-300K-Filtered | Student Model=Llama-3.... | 2026.04 | 26.85 |
| WebR-Basic | Student Model=Llama-3.... | 2026.04 | 26.57 |
| WildChat | Student Model=Llama-3.... | 2026.04 | 21.69 |