LLM Generation Performance on A100 80GB (inference)
[Chart: Maximum Batch Size by method; selectable metrics: Maximum Batch Size, Throughput (tokens/s). Plotted point: Layer-Condensed KV Cache, Maximum Batch Size 128, May 17, 2024.]
Updated 1mo ago
Evaluation Results
Method                     Configuration               Date      Maximum Batch Size   Throughput (tokens/s)
Layer-Condensed KV Cache   Model Size=7B, Seq. Le...   2024.05   128                  421.02
Layer-Condensed KV Cache   Model Size=7B, Seq. Le...   2024.05   42                   315.09
Layer-Condensed KV Cache   Model Size=30B, Seq. L...   2024.05   32                   108.29
Llama                      Model Size=7B, Seq. Le...   2024.05   15                   141.1
Layer-Condensed KV Cache   Model Size=30B, Seq. L...   2024.05   8                    77.65
Llama                      Model Size=30B, Seq. L...   2024.05   1                    14.1