SOTA Decoding Latency on Llama-2-7B 64k sequence length v1 (inference) and PapersWithCode

Decoding Latency (s): 0.098

TailorKV

Updated 5mo ago

Evaluation Results