| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Online Inference | ShareGPT | P50 Latency: 27 | 32 |
| Large Language Model Throughput | ShareGPT v3 | Throughput (req/s): 8.85 | 24 |
| Proactive next utterance prediction | ShareGPT (test) | LLM-Judge: 52.66 | 17 |
| Speculative Decoding | ShareGPT Llama-3.1-8B 1.0 (test) | MT-Bench Score: 3.2124 | 10 |
| Fine-tuning Robustness | ShareGPT | FSR: 9,800 | 10 |
| Multi-turn dialogue | ShareGPT 3 Turn, 6491 tokens | PPL: 2.79 | 6 |
| Multi-turn dialogue | ShareGPT 2 Turn, 3006 tokens | PPL: 2.91 | 6 |
| Multi-turn dialogue | ShareGPT 1 Turn, 765 tokens | Perplexity: 4.01 | 6 |
| Instruction Following | ShareGPT | MT-Bench Score: 3.99 | 5 |