
Decoding Latency on Llama-3.1-8B 64k sequence length v1

0.05 Decoding Latency (s)

Full Cache

May 26, 2025
Updated 1mo ago

Evaluation Results

Date       Decoding Latency (s)
2025.05    0.05
2025.05    0.054
2025.05    0.074
2025.05    0.108
2025.05    0.435