
Decoding Latency on Llama-2-7B 64k sequence length v1 (inference)

Decoding Latency (s): 0.098 (TailorKV)

[Chart: Decoding Latency (s) over time; data point dated May 26, 2025]
Updated 1mo ago

Evaluation Results

Date     Decoding Latency (s)  Method
2025.05  0.098                 TailorKV
2025.05  0.114
2025.05  0.135
2025.05  0.14
2025.05  1.767