| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| INSCIT | UniConv | F1 | 33.2 | 15 | 1mo ago |
| QReCC | ChatR1-7b | F1 Score | 31 | 15 | 1mo ago |
| TopiOCQA | ChatR1-7b | F1 Score | 30.6 | 15 | 1mo ago |
| PRIST 1.0 (test) | OMG-LLaVA | BLEU-4 | 0.1121 | 13 | 3mo ago |
| FaithDial | ChatR1-3b | F1 Score | 19.2 | 12 | 1mo ago |
| MD2Dial | ChatR1-7b | F1 Score | 31.2 | 11 | 1mo ago |
| QReCC (test) | ChatRetriever + Mistral | F1 Score | 26.3 | 10 | 3mo ago |
| TopiOCQA (test) | UniConv | F1 Score | 0.296 | 10 | 3mo ago |
| Reddit (test) | | Dist-1 | 0.947 | 9 | 3mo ago |
| INSCIT (test) | UniConv | F1 Score | 33.2 | 9 | 3mo ago |
| OR-QUAC (test) | | F1 Score | 17.8 | 9 | 3mo ago |
| ReDial (test) | CR-Walker | Fluency | 2.6 | 7 | 3mo ago |
| ReDial | STARCRS | Fluency | 82 | 6 | 3mo ago |
| Bitext Retail Banking LLM Chatbot (test) | SFT Model | BLEU | 26.85 | 5 | 3mo ago |
| Cornell Movie Dialog 110K Data | PALM | Perplexity | 21.98 | 4 | 3mo ago |
| Cornell Movie Dialog 10K Data | PALM | Perplexity | 45.43 | 4 | 3mo ago |
| DiscussLLM (test) | Decoupled | Response Perplexity | 2.54 | 2 | 16d ago |