Decoding Throughput on Llama 2 70B v1.0 (inference)
[Chart: Throughput (TOK/s) over time. Top result: QTIP, 23.5 TOK/s, Jun 17, 2024.]
Evaluation Results
Method   Settings                    Date      Throughput (TOK/s)
QTIP     BITS=2, Batch size=1,...    2024.06   23.5
QUIP#    BITS=2, Batch size=1,...    2024.06   22.2
QTIP     BITS=3, Batch size=1,...    2024.06   19.1
QTIP     BITS=4, Batch size=1,...    2024.06   16.3
AQLM     BITS=2, Batch size=1,...    2024.06   8.78