
Decoding Latency on Llama-2-7B 32k sequence length v1 (inference)

Decoding Latency (s): 0.062

TailorKV

[Chart: Decoding Latency (s) over time; data point dated May 26, 2025]
Updated 1mo ago

Evaluation Results

Date      Decoding Latency (s)   Method   Links
2025.05   0.062
2025.05   0.077
2025.05   0.087
2025.05   0.111
2025.05   0.838
2025.05   1.776