Conversational Instruction Following on UltraChat
Top Overall Score: 9.1 (EPI), Apr 15, 2026
Evaluation Results
| Method | Configuration | Date | Overall Score |
|---|---|---|---|
| EPI | Base Model=Gemma-2-9B,... | 2026.04 | 9.1 |
| Static Isolation | Base Model=Gemma-2-9B,... | 2026.04 | 8.8 |
| EPI | Base Model=LLaMA-3-8B,... | 2026.04 | 8.6 |
| Full SFT | Base Model=Gemma-2-9B,... | 2026.04 | 8.5 |
| Static Isolation | Base Model=LLaMA-3-8B,... | 2026.04 | 8.3 |
| Full SFT | Base Model=LLaMA-3-8B,... | 2026.04 | 8.1 |