LLM Inference Efficiency on Synthetic LLM Workload (4K Input/4K Output)

Latency (s): 160.32 (ChunkKV)

Feb 4, 2025
Updated 23d ago

Evaluation Results

Date      Latency (s)   (unlabeled)
2025.02   160.32        41.3
2025.02   162.85        41.12
2025.02   163.45        40.51
2025.02   175.5         37.73