
LLM Generation Performance on A100 80GB (inference)

Maximum Batch Size: 128

Layer-Condensed KV Cache

-4.0830.2164.598.79
May 17, 2024
Updated 1mo ago

Evaluation Results

Method    Links
2024.05    128421.02
2024.05    42315.09
2024.05    32108.29
2024.05    15141.1
2024.05    877.65
2024.05    114.1