Language Model Inference Efficiency on Meta-Llama-3-8B
[Figure: Throughput (tokens/s) vs. Latency (ms/token) over time; leading method AWQ at 1,398 tokens/s as of Feb 27, 2026.]
Evaluation Results
| Method | Configuration | Date | Throughput (tokens/s) | Latency (ms/token) |
|---|---|---|---|---|
| AWQ | Quantization=4-bit | 2026.02 | 1,398 | 0.715 |
| DACQ Hybrid | Quantization=4-bit | 2026.02 | 1,022.5 | 0.978 |
| DACQ Logistic | Quantization=4-bit | 2026.02 | 975.05 | 1.025 |
| Unquantized | Precision=FP16 | 2026.02 | 857.6 | 1.166 |
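The latency column appears to be the reciprocal of throughput (1000 ms / tokens-per-second), and the 4-bit AWQ entry runs roughly 1.63× faster than the unquantized FP16 baseline. A small sketch that checks both relationships against the table values (the `results` dict is just the table above transcribed, not an API from the leaderboard):

```python
# Reported (throughput in tokens/s, latency in ms/token) per method,
# transcribed from the Evaluation Results table.
results = {
    "AWQ": (1398.0, 0.715),
    "DACQ Hybrid": (1022.5, 0.978),
    "DACQ Logistic": (975.05, 1.025),
    "Unquantized (FP16)": (857.6, 1.166),
}

for method, (throughput, latency) in results.items():
    # Latency derived from throughput: 1000 ms / (tokens per second).
    derived = 1000.0 / throughput
    print(f"{method}: reported {latency} ms/token, derived {derived:.3f} ms/token")

# Speedup of the best 4-bit method over the FP16 baseline.
speedup = results["AWQ"][0] / results["Unquantized (FP16)"][0]
print(f"AWQ speedup over FP16: {speedup:.2f}x")
```

The derived latencies match the reported column to within rounding, which suggests latency here is per-token inverse throughput rather than an independently measured quantity.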