LLM Decoding on LLaMA 3.1-8B, 64K context
[Chart: Dense Latency (ms) by date. Dense: 62.6 on May 20, 2026.]
Updated 13d ago
Evaluation Results
Method | Settings | Date | Dense Latency (ms) | Cert Latency (ms) | Decoding Ratio | Mean K* Value | Cache Hit Rate | Host-to-Device Memory Transfer (MB)
Dense | Model=LLaMA 3.1-8B, Ha... | 2026.05 | 62.6 | - | - | - | - | -
Runtime-Certified Bounded-Error Quantized Attention | Model=LLaMA 3.1-8B, Ha... | 2026.05 | - | 257.4 | 4.11 | 222 | 99.6 | 251
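The page does not define Decoding Ratio. One reading that happens to match the reported numbers is the certified latency divided by the dense latency. The sketch below checks that arithmetic; the function name `latency_ratio` and the definition itself are assumptions for illustration, not something the benchmark states.

```python
# Values copied from the Evaluation Results entries above.
dense_latency_ms = 62.6    # Dense, Dense Latency (ms)
cert_latency_ms = 257.4    # Runtime-Certified Bounded-Error Quantized Attention, Cert Latency (ms)


def latency_ratio(cert_ms: float, dense_ms: float) -> float:
    """Assumed definition: certified latency divided by dense latency."""
    if dense_ms <= 0:
        raise ValueError("dense latency must be positive")
    return cert_ms / dense_ms


ratio = latency_ratio(cert_latency_ms, dense_latency_ms)
print(f"{ratio:.2f}")  # prints 4.11
```

Under this assumed definition, the certified method's decoding step takes roughly four times as long as the dense baseline on this setup.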