LLM Inference on LLaMA-7B v1 (serving)

Decode Latency (ms/token): 12.16

FlashSVD v1.5

May 8, 2026
Updated 22d ago

Evaluation Results

Date     Decode Latency (ms/token)  (unlabeled)  (unlabeled)
2026.05  12.16                      0.43         2.55
2026.05  12.16                      0.43         2.2
2026.05  12.24                      1.72         2.5
2026.05  12.24                      1.72         2.13
2026.05  12.85                      1.93         2.38
2026.05  12.85                      1.93         2.07
2026.05  14.09                      2.39         2.18
2026.05  14.09                      2.39         1.86
2026.05  26.19                      3.51         -
2026.05  26.21                      3.94         -
2026.05  26.52                      3.68         -
2026.05  26.9                       0.91         -
2026.05  30.6                       4.29         -
2026.05  30.63                      4.1          -
2026.05  30.65                      4.85         -
2026.05  30.84                      1.03         -