| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Chatbot workload | ShareGPT | Average PTLA (s/token) | 0.36 | 36 |
| Online Inference | ShareGPT | P50 Latency | 27 | 32 |
| Multi-turn dialogue | ShareGPT | Success Rate (SR) | 94.11 | 24 |
| Large Language Model Throughput | ShareGPT v3 | Throughput (req/s) | 8.85 | 24 |
| Proactive next utterance prediction | ShareGPT (test) | LLM-Judge | 52.66 | 17 |
| Speculative Decoding | ShareGPT Llama-3.1-8B 1.0 (test) | MT-Bench Score | 3.2124 | 10 |
| Fine-tuning Robustness | ShareGPT | FSR | 9,800 | 10 |
| Multi-turn dialogue routing | ShareGPT-LF Llama Series Set, cross-domain (legal and financial) | Success Rate (SR) | 90.07 | 9 |
| Multi-turn dialogue routing | ShareGPT-LF Qwen Series Set, cross-domain (legal and financial) | Success Rate (SR) | 91.46 | 9 |
| Text-to-image generation | ShareGPT-4o-Image SD3-Medium | CLIP Score | 35.1851 | 7 |
| Multi-turn dialogue routing | ShareGPT Mixed candidate set (Qwen and Llama) | Success Rate (SR) | 94.99 | 6 |
| Multi-turn dialogue | ShareGPT, 3 turns, 6,491 tokens | Perplexity (PPL) | 2.79 | 6 |
| Multi-turn dialogue | ShareGPT, 2 turns, 3,006 tokens | Perplexity (PPL) | 2.91 | 6 |
| Multi-turn dialogue | ShareGPT, 1 turn, 765 tokens | Perplexity (PPL) | 4.01 | 6 |
| LLM Decoding | ShareGPT | Latency (ms/token) | 2.4 | 5 |
| Instruction Following | ShareGPT | MT-Bench Score | 3.99 | 5 |
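Several rows above report perplexity (PPL), where lower is better. As a quick reference for interpreting those numbers, here is a minimal sketch of how perplexity is conventionally computed from per-token log-probabilities; the function name and sample values are illustrative, not taken from any of the benchmarked systems.

```python
import math

def perplexity(token_logprobs):
    """Perplexity from per-token natural-log probabilities:
    PPL = exp(-mean(log p)). Lower means the model is less
    'surprised' by the evaluated tokens."""
    nll = -sum(token_logprobs) / len(token_logprobs)  # average negative log-likelihood
    return math.exp(nll)

# A model that assigns every token probability 0.5 has PPL ≈ 2,
# i.e. it is as uncertain as a fair coin flip per token.
print(perplexity([math.log(0.5)] * 4))  # ≈ 2.0
```

On this scale, the table's PPL of 2.79 at 3 turns versus 4.01 at 1 turn indicates the model predicts tokens more confidently as dialogue context grows.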