
LLM Inference Efficiency on Synthetic LLM Workload (4K Input/4K Output)

Top result: ChunkKV, Latency (s): 160.32

Updated 2mo ago

Evaluation Results

Method	Links	Latency (s)	(unlabeled)
ChunkKV 2025.02		160.32	41.3
ShotKV 2025.02		162.85	41.12
SnapKV 2025.02		163.45	40.51
FullKV 2025.02		175.5	37.73
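
The latency gaps in the table are easier to compare as relative reductions. The sketch below is a minimal illustration, not part of the benchmark: it assumes FullKV is the baseline to compare against, and the names `latency_s` and `reduction_vs_baseline` are hypothetical. The latency values are copied from the table.

```python
# Latencies in seconds, copied from the results table.
latency_s = {
    "ChunkKV": 160.32,
    "ShotKV": 162.85,
    "SnapKV": 163.45,
    "FullKV": 175.5,
}


def reduction_vs_baseline(latencies, baseline="FullKV"):
    """Return the percent latency reduction of each method relative to the baseline.

    Treating FullKV as the baseline is an assumption made for this illustration.
    """
    base = latencies[baseline]
    return {
        name: 100.0 * (base - value) / base
        for name, value in latencies.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, pct in reduction_vs_baseline(latency_s).items():
        print(f"{name}: {pct:.2f}% lower latency than FullKV")
```

Under that assumption, ChunkKV comes out roughly 8.6% faster than FullKV, with ShotKV and SnapKV at about 7.2% and 6.9%.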