LLM Decoding on LLaMA 3.1-8B (8K context)
[Chart: Dense Latency (ms). Plotted point: Dense, 60.9 ms, May 20, 2026. Selectable metrics: Dense Latency (ms), Cert Latency (ms), Latency Ratio (Cert/Dense), Mean KV Cache Usage (K*), Cache Hit Rate, H2D Memory Transfer (MB).]
Updated 13d ago
Evaluation Results
| Method | Details | Date | Dense Latency (ms) | Cert Latency (ms) | Latency Ratio (Cert/Dense) | Mean KV Cache Usage (K*) | Cache Hit Rate | H2D Memory Transfer (MB) |
|---|---|---|---|---|---|---|---|---|
| Dense | Model=LLaMA 3.1-8B, Ha... | 2026.05 | 60.9 | - | - | - | - | - |
| Runtime-Certified Bounded-Error Quantized Attention | Model=LLaMA 3.1-8B, Ha... | 2026.05 | - | 166.3 | 2.73 | 159 | 100 | 0 |
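The Latency Ratio (Cert/Dense) column can be checked against the two latency figures reported above. The sketch below is a minimal illustration, assuming the ratio is simply the Cert latency divided by the Dense latency, as the column name suggests. The function name `latency_ratio` and the variable names are hypothetical and not part of the benchmark.

```python
# Latencies as reported in the evaluation results (milliseconds).
dense_latency_ms = 60.9   # Dense row
cert_latency_ms = 166.3   # Runtime-Certified Bounded-Error Quantized Attention row


def latency_ratio(cert_ms: float, dense_ms: float) -> float:
    """Return cert_ms / dense_ms (assumed definition of the Cert/Dense ratio)."""
    if dense_ms <= 0:
        raise ValueError("dense latency must be positive")
    return cert_ms / dense_ms


ratio = latency_ratio(cert_latency_ms, dense_latency_ms)
overhead_ms = cert_latency_ms - dense_latency_ms  # absolute added latency

print(f"Latency ratio (Cert/Dense): {ratio:.2f}")  # 2.73
print(f"Added latency: {overhead_ms:.1f} ms")      # 105.4 ms
```

Under this assumption the computed ratio matches the reported 2.73, which corresponds to roughly 105.4 ms of added latency over the dense baseline.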