
Decoding Latency on Llama-2-7B 32k sequence length v1 (inference)

Top result: TailorKV, Decoding Latency 0.062 s

[Chart: Decoding Latency (s) over time; x-axis date May 26, 2025]
Updated 4d ago

Evaluation Results

Date      Decoding Latency (s)
2025.05   0.062
2025.05   0.077
2025.05   0.087
2025.05   0.111
2025.05   0.838
2025.05   1.776