Decoding Latency on Llama-2-7B 16k sequence length v1 (inference)

Decoding Latency (s): 0.041

TailorKV

[Chart: Decoding Latency (s) over time; May 26, 2025]
Updated 4d ago

Evaluation Results

Date       Decoding Latency (s)
2025.05    0.041
2025.05    0.045
2025.05    0.067
2025.05    0.108
2025.05    0.433
2025.05    0.893