Inference Throughput on Llama-3B
Chart: Throughput (TOK/s) by date. Top result: GPTQ, 215.6 TOK/s (Jan 29, 2026).
Evaluation Results
Method     Configuration              Date     Throughput (TOK/s)
GPTQ       Quantization=W4A16, Ha...  2026.01  215.6
HeRo-Q     Quantization=W4A16, Ha...  2026.01  210
SpinQuant  Quantization=W4A16, Ha...  2026.01  209.5
FP16       Precision=FP16, Hardwa...  2026.01  118.4
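
To put the throughput figures above in relative terms, the following is a minimal sketch that computes each method's speedup over the FP16 entry. The numbers are copied from the results listed here; the function name `speedup_over_baseline` and the choice of FP16 as the baseline are assumptions made for illustration, not part of the benchmark itself.

```python
# Throughput values (TOK/s) copied from the results listed above.
throughput = {
    "GPTQ": 215.6,
    "HeRo-Q": 210.0,
    "SpinQuant": 209.5,
    "FP16": 118.4,
}


def speedup_over_baseline(results, baseline="FP16"):
    """Return each method's throughput divided by the baseline's throughput.

    Hypothetical helper for illustration; the baseline choice is an assumption.
    """
    base = results[baseline]
    return {name: value / base for name, value in results.items()}


speedups = speedup_over_baseline(throughput)

# Print methods from fastest to slowest, with the ratio to two decimals.
for name, ratio in sorted(speedups.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<10} {ratio:.2f}x")
```

On these figures, all three W4A16 entries come out at roughly 1.8x the FP16 throughput, and the gaps between them are small by comparison.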