| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| MT-Bench (test) | LoRA | GPT-4 Score | 8.36 | 46 | 4d ago |
| 4 dialogue tasks (Skill Talk, Empathetic Dialogues, Wizard of Internet, Wizard of Wikipedia) (test) | | F1 Score | 13.7 | 24 | 4d ago |
| Dialogue | HBAT | PandaLM | 77.79 | 18 | 4d ago |
| Anthropic-HH (distillation set) | | Response Word Count | 73.53 | 16 | 4d ago |
| DailyDialog | GPT2-tree | R-1 | 14.99 | 10 | 4d ago |
| WoW | MindRef | F1 Score | 14.77 | 8 | 4d ago |
| GROWOVER-DIALOGUE (NEW) | RiLM | BLEU (Month 9) | 5.36 | 6 | 2d ago |
| GROWOVER-DIALOGUE (UNCHANGED) | RiLM | BLEU (Month 9) | 4.68 | 6 | 2d ago |
| Dialogue (test) | | Fluency | 8.84 | 5 | 4d ago |
| Dialogue dataset | M-RAG | BLEU-1 | 24.52 | 4 | 4d ago |