Decoding Throughput on Llama 2 7B inference v1.0
[Chart: Decoding Throughput (TOK/s) over time. Labeled point: QTIP, 188 TOK/s, Jun 17, 2024.]
Updated 4d ago
Evaluation Results
Method  Settings                   Date     Decoding Throughput (TOK/s)
QTIP    BITS=2, Batch size=1,...   2024.06  188
QUIP#   BITS=2, Batch size=1,...   2024.06  186
QTIP    BITS=3, Batch size=1,...   2024.06  161
QTIP    BITS=4, Batch size=1,...   2024.06  140
AQLM    BITS=2, Batch size=1,...   2024.06  81.5
FP16    BITS=16, Batch size=1,...  2024.06  55.9
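
To put the figures above in relative terms, the following is a minimal sketch that divides each decoding throughput by the FP16 row. The throughput values are copied from the results listed here; the dictionary labels and the helper name `speedup_over_baseline` are illustrative assumptions and are not part of the benchmark.

```python
# Decoding throughput (TOK/s) copied from the results above.
# The labels are illustrative; the truncated settings are not reproduced.
throughput = {
    "QTIP 2-bit": 188.0,
    "QUIP# 2-bit": 186.0,
    "QTIP 3-bit": 161.0,
    "QTIP 4-bit": 140.0,
    "AQLM 2-bit": 81.5,
    "FP16": 55.9,
}


def speedup_over_baseline(results, baseline="FP16"):
    """Return each entry's throughput divided by the baseline's throughput."""
    base = results[baseline]
    return {name: tok_s / base for name, tok_s in results.items()}


speedups = speedup_over_baseline(throughput)

# Print entries from fastest to slowest relative to the baseline.
for name, ratio in sorted(speedups.items(), key=lambda kv: -kv[1]):
    print(f"{name:12s} {ratio:.2f}x")
```

The ratios compare only the reported throughput numbers; they say nothing about model quality at each bit width.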