
Inference Efficiency on LLaMA 3.1 8B, 32K context length

Theoretical Compute (TFLOPs): 1,115

SpecKV

[Chart: Theoretical Compute (TFLOPs)]
Mar 11, 2026
Updated 1mo ago

Evaluation Results

Method   Links
2026.03  1,115  106  2,156  402.8   2,263  503
2026.03    930  451  1,993  239.26  2,314  554
2026.03    929   13  1,755  1.74    1,798   38
2026.03    928   13  1,754  -       1,760    -
2026.03    928   13  1,754  0.01    1,838   78