ShareGPT

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Chatbot workload | ShareGPT | Average PTLA (s/token): 0.36 | 36 |
| Online Inference | ShareGPT | P50 Latency: 27 | 32 |
| Multi-turn dialogue | ShareGPT | Success Rate (SR): 94.11 | 24 |
| Large Language Model Throughput | ShareGPT v3 | Throughput (req/s): 8.85 | 24 |
| Text Generation | ShareGPT | Speedup vs AR: 2.01 | 19 |
| Proactive next utterance prediction | ShareGPT (test) | LLM-Judge: 52.66 | 17 |
| Response Similarity | ShareGPT | Response Similarity: 96.4 | 12 |
| Speculative Decoding | ShareGPT Llama-3.1-8B 1.0 (test) | MT-Bench Score: 3.2124 | 10 |
| Fine-tuning Robustness | ShareGPT | FSR: 9,800 | 10 |
| Multi-turn dialogue routing | ShareGPT-LF Llama Series Set (cross-domain (legal and financial)) | Success Rate (SR): 90.07 | 9 |
| Multi-turn dialogue routing | ShareGPT-LF Qwen Series Set cross-domain (legal and financial) | Success Rate (SR): 91.46 | 9 |
| Text-to-image generation | ShareGPT-4o-Image SD3-Medium | CLIP Score: 35.1851 | 7 |
| LLM Inference | ShareGPT | Throughput (RPS): 58.33 | 6 |
| Multi-turn dialogue routing | ShareGPT Mixed candidate set (Qwen and Llama) | SR: 94.99 | 6 |
| Multi-turn dialogue | ShareGPT 3 Turn 6491 tokens | PPL: 2.79 | 6 |
| Multi-turn dialogue | ShareGPT 2 Turn, 3006 tokens | PPL: 2.91 | 6 |
| Multi-turn dialogue | ShareGPT 1 Turn, 765 tokens | Perplexity: 4.01 | 6 |
| Hybrid Mamba-Transformer Inference | ShareGPT (trace replay) | OOM Rate: 2.33 | 5 |
| LLM Decoding | ShareGPT | Latency (ms/token): 2.4 | 5 |
| Instruction Following | ShareGPT | MT-Bench Score: 3.99 | 5 |
| KV cache reuse efficiency | ShareGPT | Match Rate: 30.5 | 4 |
| Conversation | ShareGPT | Throughput: 2.01 | 2 |
| End-to-end LLM Inference Serving | ShareGPT | TPOT Speedup vs DeepGEMM: 1.5 | 2 |