LLM Generation Performance on A100 80GB (inference)

Maximum Batch Size: 128

Layer-Condensed KV Cache

Updated 5mo ago

Evaluation Results

Method	Links	Maximum Batch Size
Layer-Condensed KV Cache 2024.05		128	421.02
Layer-Condensed KV Cache 2024.05		42	315.09
Layer-Condensed KV Cache 2024.05		32	108.29
Llama 2024.05		15	141.1
Layer-Condensed KV Cache 2024.05		8	77.65
Llama 2024.05		1	14.1