System Performance Evaluation on Multi-LLM Serving Workload
[Chart: TTFT (ms) by method. Highlighted point: Full-FT, 96.4 ms, Mar 3, 2026. Selectable metrics: TTFT (ms), TPOT (ms), TPUT (tok/s).]
Evaluation Results
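| Method  | Configuration              | Date    | TTFT (ms) | TPOT (ms) | TPUT (tok/s) |
|---------|----------------------------|---------|-----------|-----------|--------------|
| Full-FT | Model=Qwen3-8B-Base, #...  | 2026.03 | 96.4      | 12.2      | 81.3         |
| QSUN    | Model=Qwen3-8B-Base, #...  | 2026.03 | 97.1      | 6.6       | 147.9        |
| QSUN    | Model=LLaMA3.1-8B, # B...  | 2026.03 | 98.9      | 7.6       | 130.2        |
| Full-FT | Model=LLaMA3.1-8B, # B...  | 2026.03 | 99.1      | 13.8      | 73.3         |
| AWQ     | Model=Qwen3-8B-Base, #...  | 2026.03 | 115.8     | 6.6       | 148.9        |
| AWQ     | Model=LLaMA3.1-8B, # B...  | 2026.03 | 118.7     | 7.6       | 130.8        |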