LLM Decoding on ShareGPT
[Chart: Latency (ms/token) for Llama2-7B; highlighted value 2.4 ms/token on Mar 12, 2026. Available metrics: Latency (ms/token), Peak Memory (GiB).]
Evaluation Results
Method      Configuration              Date      Latency (ms/token)   Peak Memory (GiB)
Llama2-7B   Backbone=Llama2-7B         2026.03   2.4                  12.9
AdaFuse     Backbone=Llama2-7B, Ad...  2026.03   3.1                  13.8
PESC        Backbone=Llama2-7B, Ad...  2026.03   8.5                  13.1
MoRAL       Backbone=Llama2-7B, Ad...  2026.03   8.6                  13.3
MOLA        Backbone=Llama2-7B, Ad...  2026.03   25.3                 26.3