Language Model Inference Efficiency on Meta-Llama-3-8B

Top throughput: 1,398 tokens/s

Method: AWQ

[Chart: throughput (tokens/s); axis ticks from 835.984 to 1,273.708; date label Feb 27, 2026]
Updated 1mo ago

Evaluation Results

Date      Throughput (tokens/s)   (unlabeled)
2026.02   1,398                   0.715
2026.02   1,022.5                 0.978
2026.02   975.05                  1.025
2026.02   857.6                   1.166
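
The four results above can be compared with a little arithmetic. The following is a minimal sketch, not part of the original page: it expresses each throughput as a fraction of the best one and multiplies each throughput by the second value listed beside it. The variable names are illustrative, and the second column is left unnamed because its label is not given here. The products all land close to 1,000, which would be consistent with the second value being milliseconds per token, but that reading is an assumption.

```python
# Rows as (throughput in tokens/s, second listed value). The second column's
# label is not given in the page text, so it is left unnamed here.
rows = [
    (1398.0, 0.715),
    (1022.5, 0.978),
    (975.05, 1.025),
    (857.6, 1.166),
]

# Best (highest) throughput among the listed results.
best = max(t for t, _ in rows)

# Each throughput as a fraction of the best one.
relative = [t / best for t, _ in rows]

# Product of the two listed values per row; a near-constant product would
# suggest the second value is a reciprocal of throughput (an assumption).
products = [t * v for t, v in rows]

# Ratio between the fastest and slowest listed results.
speedup = best / min(t for t, _ in rows)

for (t, v), rel, prod in zip(rows, relative, products):
    print(f"{t:>8.2f} tokens/s  {rel:6.1%} of best  product: {prod:8.2f}")
print(f"fastest / slowest: {speedup:.2f}x")
```

If the products did not cluster around a single constant, the reciprocal reading of the second column would not hold and the column would need its original label to be interpreted.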