Inference Efficiency on 32k Context Length Llama-3-8B (test)

Time To First Token (s): 4.12

Full Model

[Chart: Time To First Token (s), Mar 15, 2026]
Updated 1mo ago

Evaluation Results

Method  Links
2026.03   4.12   0.081   24.27
2026.03   4.12   0.032   16.12
2026.03   4.12   0.033   16.58
2026.03   4.13   0.032   16.37
2026.03   4.25   0.031   15.94
2026.03   4.28   0.039   17.03
2026.03   4.29   0.038   16.91