End-to-end decode throughput on 8K Context Length
[Chart: Throughput (tok/s) by date. Highlighted result: 268.5 tok/s, Llama-3.1-8B. Date shown: May 13, 2026.]
Updated 14d ago
Evaluation Results
Model         KV Representation  Date     Throughput (tok/s)
Llama-3.1-8B  SphKV              2026.05  268.5
GPT-oss       SphKV              2026.05  261.2
Qwen2.5-14B   SphKV              2026.05  239.4
Llama-3.1-8B  Dense              2026.05  173.4
GPT-oss       Dense              2026.05  163.7
Qwen2.5-14B   Dense              2026.05  151.6
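
The results above pair each model's SphKV throughput with its Dense throughput, so the relative gain can be read off as a simple ratio. The following is a minimal sketch of that arithmetic, not part of the benchmark itself; the `throughput` dictionary and the `speedup` helper are hypothetical names introduced here, and the values are copied from the listed results.

```python
# Decode throughput (tok/s) at 8K context length, copied from the results above.
# The dictionary layout and the helper name are illustrative assumptions.
throughput = {
    "Llama-3.1-8B": {"SphKV": 268.5, "Dense": 173.4},
    "GPT-oss": {"SphKV": 261.2, "Dense": 163.7},
    "Qwen2.5-14B": {"SphKV": 239.4, "Dense": 151.6},
}


def speedup(model: str) -> float:
    """Return the ratio of SphKV to Dense decode throughput for one model."""
    row = throughput[model]
    return row["SphKV"] / row["Dense"]


# Print the per-model ratio, rounded to two decimals.
for model in throughput:
    print(f"{model}: {speedup(model):.2f}x")
```

On these numbers the ratio falls between roughly 1.5x and 1.6x for all three models. A ratio is used rather than an absolute difference so that models with different baseline throughput can be compared on the same scale.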