Efficiency Analysis on Context Length 32K
[Chart: Theoretical Compute (TFLOPs) for Forward Pass Only, 928 as of Mar 11, 2026]
Updated 1mo ago
Evaluation Results
| Method | Theoretical Compute (TFLOPs) | Theoretical Memory Traffic (GB) | Theoretical TTFT (ms) | Theoretical TTFT Overhead (ms) | Empirical TTFT (ms) | Empirical TTFT Overhead (ms) |
| --- | --- | --- | --- | --- | --- | --- |
| Forward Pass Only (2026.03) | 928 | 13 | 1,754 | - | 1,760 | - |
| SnapKV (2026.03) | 928 | 13 | 1,754 | 0.01 | 1,838 | 77.67 |
| LOOKAHEADKV (2026.03) | 929 | 13 | 1,755 | 1.74 | 1,798 | 38.04 |
| LAQ (2026.03) | 930 | 451 | 1,993 | 239.26 | 2,314 | 553.68 |
| SpecKV (2026.03) | 1,115 | 106 | 2,156 | 402.8 | 2,263 | 502.87 |
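
The page does not say how the overhead columns are computed. The sketch below tests one plausible reading: that Empirical TTFT Overhead is each method's Empirical TTFT minus the Forward Pass Only value. That definition is an assumption, as are the dictionary and function names; only the numbers come from the results above. Since the listed TTFT values appear rounded to whole milliseconds, small gaps against the reported overheads are expected.

```python
# Empirical TTFT (ms), copied from the results above.
EMPIRICAL_TTFT_MS = {
    "Forward Pass Only": 1760,
    "SnapKV": 1838,
    "LOOKAHEADKV": 1798,
    "LAQ": 2314,
    "SpecKV": 2263,
}

# Empirical TTFT Overhead (ms), copied from the results above.
REPORTED_OVERHEAD_MS = {
    "SnapKV": 77.67,
    "LOOKAHEADKV": 38.04,
    "LAQ": 553.68,
    "SpecKV": 502.87,
}


def overhead_vs_baseline(ttft_ms, baseline="Forward Pass Only"):
    """Assumed definition: overhead = method TTFT - baseline TTFT."""
    base = ttft_ms[baseline]
    return {m: t - base for m, t in ttft_ms.items() if m != baseline}


derived = overhead_vs_baseline(EMPIRICAL_TTFT_MS)
for method, value in derived.items():
    reported = REPORTED_OVERHEAD_MS[method]
    gap = abs(value - reported)
    print(f"{method}: derived {value} ms, reported {reported} ms, gap {gap:.2f} ms")
```

Under this assumption, every derived overhead lands within 1 ms of the reported figure, which is consistent with rounding in the TTFT columns but does not confirm the definition.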