
Decoding Latency on Llama-2-7B 96k sequence length v1 (inference)

Decoding Latency (s): 0.115

PQCache

[Chart: Decoding Latency (s), dated May 26, 2025]
Updated 4d ago

Evaluation Results

Date       Decoding Latency (s)
2025.05    0.115
2025.05    0.132
2025.05    0.176
2025.05    3.253