| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| MT-bench | CORAL | Kendall's Tau | 5.25 | 54 | 4d ago |
| MT-Bench | DOUBLE | Speedup | 4.1 | 47 | 4d ago |
| MTBench101 | | Score | 9.03 | 33 | 4d ago |
| TSEData | ChatAD-Mistral-7B | Accuracy | 96.46 | 13 | 4d ago |
| MT-Bench (MTB) | | Speedup Factor | 2.53 | 8 | 4d ago |
| NPC-Chat (test) | AT-GRPO | Fluency | 3.84 | 8 | 4d ago |
| ACEBench En | | MT Accuracy | 68 | 7 | 4d ago |
| Honor-Dialogue | DVPO | Life Services Domain Performance | 88.13 | 6 | 4d ago |
| ShareGPT 3 Turn, 6491 tokens | AdmTree | PPL | 2.79 | 6 | 4d ago |
| ShareGPT 2 Turn, 3006 tokens | AdmTree | PPL | 2.91 | 6 | 4d ago |
| ShareGPT 1 Turn, 765 tokens | AdmTree | Perplexity | 4.01 | 6 | 4d ago |