Inference Throughput on Llama-8B
[Chart: Throughput (Tokens/s). Highlighted result: GPTQ, 115.2 (Jan 29, 2026)]
Updated 4d ago
Evaluation Results
| Method | Configuration | Date | Throughput (Tokens/s) |
|---|---|---|---|
| GPTQ | Quantization=W4A16, Ha... | 2026.01 | 115.2 |
| HeRo-Q | Quantization=W4A16, Ha... | 2026.01 | 113.1 |
| SpinQuant | Quantization=W4A16, Ha... | 2026.01 | 112.8 |
| FP16 | Precision=FP16, Hardwa... | 2026.01 | 48.5 |