Interactive Serving on Llama 3.1-8B-Instruct (512->128 tokens, concurrency=1)

Throughput (Tok/s): 269

Pre-comp. clustering rebuild

3.872.65141.5210.35
Mar 18, 2026
Updated 1mo ago

Evaluation Results

Method Links
2026.03
269173.611860.873.1
2026.03
25918.63.741880.836.1
2026.03
15725.66.212320.516.9
2026.03
15525.16.32660.515
2026.03
3413528.21660.113
2026.03
14373671770.056.5