Instruction Following Evaluation on LMSYS In-Dist.
[Chart: GPT-4o Score over time. Top result: SODA, 51.8 (Apr 4, 2026)]
Updated 11d ago
Evaluation Results
| Method  | Model               | Date    | GPT-4o Score |
|---------|---------------------|---------|--------------|
| SODA    | Llama-3.1-8B-Ins... | 2026.04 | 51.8         |
| Teacher | GPT-5-Chat          | 2026.04 | 51.7         |
| SODA    | Qwen2.5-7B-Instruct | 2026.04 | 51.5         |
| GAD     | Qwen2.5-7B-Instruct | 2026.04 | 50.8         |
| GAD     | Llama-3.1-8B-Ins... | 2026.04 | 50.3         |
| SeqKD   | Llama-3.1-8B-Ins... | 2026.04 | 49.7         |
| SODA    | Qwen2.5-3B-Instruct | 2026.04 | 49.2         |
| SeqKD   | Qwen2.5-7B-Instruct | 2026.04 | 49.2         |
| SODA    | Llama-3.2-3B-Ins... | 2026.04 | 49.1         |
| GAD     | Qwen2.5-3B-Instruct | 2026.04 | 48.9         |
| Base    | Qwen2.5-7B-Instruct | 2026.04 | 48.7         |
| GAD     | Llama-3.2-3B-Ins... | 2026.04 | 48.1         |
| SeqKD   | Llama-3.2-3B-Ins... | 2026.04 | 47.6         |
| SeqKD   | Qwen2.5-3B-Instruct | 2026.04 | 47.5         |
| Base    | Llama-3.1-8B-Ins... | 2026.04 | 46.9         |
| Base    | Qwen2.5-3B-Instruct | 2026.04 | 45.8         |
| Base    | Llama-3.2-3B-Ins... | 2026.04 | 44           |