LLM Serving Efficiency on Azure trace
[Figure: GPUs Used by method over time. Labeled point: Homogeneous, 358 GPUs, Apr 9, 2026. Selectable metrics: GPUs Used, Cost Savings, P99 TTFT (s).]
Updated 9d ago
Evaluation Results
Method        Configuration              Date     GPUs Used  Cost Savings (%)  P99 TTFT (s)
Homogeneous   Request Rate=1,000 req...  2026.04  358        -                 1.82
Token-budget  Request Rate=1,000 req...  2026.04  208        41.9              1.71
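
The page does not say how Cost Savings is computed. A minimal sketch, assuming it is the relative reduction in GPUs Used against the Homogeneous baseline, expressed in percent; this assumption reproduces the reported 41.9 for Token-budget. The function name `cost_savings_pct` is hypothetical and used only for illustration.

```python
def cost_savings_pct(baseline_gpus: int, method_gpus: int) -> float:
    """Relative GPU reduction versus the baseline, in percent.

    Assumption: Cost Savings is proportional to GPU count, so savings equal
    the fractional drop in GPUs Used relative to the baseline.
    """
    if baseline_gpus <= 0:
        raise ValueError("baseline_gpus must be positive")
    return 100.0 * (baseline_gpus - method_gpus) / baseline_gpus


# GPUs Used from the results above: Homogeneous = 358, Token-budget = 208.
savings = cost_savings_pct(358, 208)
print(f"{savings:.1f}")  # prints 41.9
```

Under the same assumption the baseline row has no savings value because it is being compared with itself, which is consistent with the "-" entry for Homogeneous.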