Efficiency Analysis on Context Length 16K
[Chart: Theoretical Compute (TFLOPs) by date. Point shown: Forward Pass Only, 336, Mar 11, 2026. Selectable metrics: Theoretical Compute (TFLOPs), Theoretical Memory Traffic (GB), Theoretical TTFT (ms), Theoretical TTFT Overhead (ms), Empirical TTFT (ms), Empirical TTFT Overhead (ms).]
Updated 1mo ago
Evaluation Results
| Method | Date | Theoretical Compute (TFLOPs) | Theoretical Memory Traffic (GB) | Theoretical TTFT (ms) | Theoretical TTFT Overhead (ms) | Empirical TTFT (ms) | Empirical TTFT Overhead (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Forward Pass Only | 2026.03 | 336 | 13 | 635 | - | 658 | - |
| SnapKV | 2026.03 | 336 | 13 | 635 | 0.01 | 695 | 37.12 |
| LOOKAHEADKV | 2026.03 | 337 | 13 | 636 | 1.27 | 677 | 18.5 |
| LAQ | 2026.03 | 337 | 447 | 871 | 236.15 | 1,182 | 523.54 |
| SpecKV | 2026.03 | 398 | 89 | 792 | 157.05 | 866 | 207.31 |
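The overhead columns above appear to be each method's TTFT minus the TTFT of the Forward Pass Only row. The page does not state this definition, so treat it as an assumption; the sketch below only checks that the reported numbers are consistent with it. The names `ROWS`, `BASELINE`, and `derived_overheads` are illustrative and not part of the benchmark. Because the table shows TTFT rounded to whole milliseconds, the derived differences can deviate from the reported overheads by up to about 1 ms.

```python
# Values copied from the results table above (TTFT in ms, rounded as displayed).
# Tuple layout: (theoretical TTFT, reported theoretical overhead,
#                empirical TTFT,   reported empirical overhead)
BASELINE = "Forward Pass Only"

ROWS = {
    "Forward Pass Only": (635, None, 658, None),
    "SnapKV": (635, 0.01, 695, 37.12),
    "LOOKAHEADKV": (636, 1.27, 677, 18.5),
    "LAQ": (871, 236.15, 1182, 523.54),
    "SpecKV": (792, 157.05, 866, 207.31),
}


def derived_overheads(rows, baseline=BASELINE):
    """Return {method: (theoretical, empirical)} overhead in ms.

    Assumption (not stated on the page): overhead = method TTFT - baseline TTFT.
    """
    base_theo, _, base_emp, _ = rows[baseline]
    result = {}
    for name, (theo, _, emp, _) in rows.items():
        if name == baseline:
            continue  # the baseline has no overhead relative to itself
        result[name] = (theo - base_theo, emp - base_emp)
    return result


if __name__ == "__main__":
    for name, (theo_oh, emp_oh) in derived_overheads(ROWS).items():
        _, rep_theo, _, rep_emp = ROWS[name]
        print(f"{name}: derived {theo_oh}/{emp_oh} ms, reported {rep_theo}/{rep_emp} ms")
```

Under this assumption, every derived overhead lands within 1 ms of the reported value, which is the most that rounding of the displayed TTFT figures would allow.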