Inference Speed Evaluation on Generating 128 Tokens
[Chart: Inference Time (ms) by date; data point shown: BlockPruner, 2,747.88 ms, Jun 15, 2024. Selectable metrics: Inference Time (ms), Speedup Factor]
Updated 3mo ago
Evaluation Results
| Method      | Configuration              | Links   | Inference Time (ms) | Speedup Factor |
|-------------|----------------------------|---------|---------------------|----------------|
| BlockPruner | Backbone=Llama2-7B, Pr...  | 2024.06 | 2,747.88            | 1.47           |
| ShortGPT    | Backbone=Llama2-7B, Pr...  | 2024.06 | 3,094.36            | 1.31           |
| SliceGPT    | Backbone=Llama2-7B, Pr...  | 2024.06 | 3,226.68            | 1.25           |
| BlockPruner | Backbone=Llama2-13B, P...  | 2024.06 | 3,873.2             | 1.88           |
| Original    | Backbone=Llama2-7B, Pr...  | 2024.06 | 4,044.3             | 1              |
| SliceGPT    | Backbone=Llama2-13B, P...  | 2024.06 | 4,099.08            | 1.78           |
| ShortGPT    | Backbone=Llama2-13B, P...  | 2024.06 | 4,111.65            | 1.77           |
| Original    | Backbone=Llama2-13B, P...  | 2024.06 | 7,285.73            | 1              |
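
The page does not state how the Speedup Factor is computed. The listed values are consistent with dividing the Original model's inference time by each method's inference time on the same backbone, so the sketch below assumes that definition. The names `times_ms` and `speedup_factors` are illustrative and do not come from the benchmark itself; the numbers are copied from the results above.

```python
# Inference times in milliseconds, copied from the results above,
# grouped by backbone so each method is compared with its own baseline.
times_ms = {
    "Llama2-7B": {
        "Original": 4044.3,
        "BlockPruner": 2747.88,
        "ShortGPT": 3094.36,
        "SliceGPT": 3226.68,
    },
    "Llama2-13B": {
        "Original": 7285.73,
        "BlockPruner": 3873.2,
        "SliceGPT": 4099.08,
        "ShortGPT": 4111.65,
    },
}


def speedup_factors(times_by_method, baseline="Original"):
    """Assumed definition: baseline time divided by each method's time."""
    base = times_by_method[baseline]
    return {method: round(base / t, 2) for method, t in times_by_method.items()}


# Speedup per backbone, rounded to two decimals like the listed values.
speedups = {backbone: speedup_factors(t) for backbone, t in times_ms.items()}

for backbone, table in speedups.items():
    # Print the fastest method first within each backbone.
    for method, factor in sorted(table.items(), key=lambda kv: -kv[1]):
        t = times_ms[backbone][method]
        print(f"{backbone:<11} {method:<12} {t:>9.2f} ms  {factor:.2f}x")
```

Under this assumption, speedup factors are only comparable within one backbone: a 13B method with a larger factor than a 7B method can still have a longer absolute inference time.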