Inference Throughput on Llama-3B

Best throughput: 215.6 TOK/s (GPTQ)

Updated 5mo ago

Evaluation Results

Method	Links	Throughput (TOK/s)
GPTQ 2026.01		215.6
HeRo-Q 2026.01		210
SpinQuant 2026.01		209.5
FP16 2026.01		118.4