
Large Language Model Serving on vLLM benchmark (128 prompts, 32 pre-fill, 256 generation tokens)

76 TTFT (ms)

DeInfer

[Chart: TTFT (ms); date label: Apr 20, 2026]
Updated 1mo ago

Evaluation Results

Method   Date      TTFT (ms)   (unlabeled)   Links
-        2026.04   76          39            -
-        2026.04   77          68            -
-        2026.04   81          59            -
-        2026.04   82          76            -
-        2026.04   83          79            -
-        2026.04   83          74            -
-        2026.04   6,999       311           -
-        2026.04   19,341      764           -
-        2026.04   21,740      812           -