SOTA LLM Inference Efficiency on Synthetic LLM Workload (Input 8192, Output 4096) | PapersWithCode

Best result: ShotKV, Latency (s) 162.78

Updated 2mo ago

Evaluation Results

Method	Date	Links	Latency (s)	(unlabeled)
ShotKV	2025.02		162.78	63.24
FullKV	2025.02		183.42	55.93