
Language Modeling Inference on Qwen2.5-7B (16K Context Length)

Decode Latency (ms/token): 8.4

FastMKA

Mar 21, 2026
Updated 25d ago

Evaluation Results

Date     Decode Latency (ms/token)  (unlabeled)
2026.03  8.4                        1.52
2026.03  12.8                       -
2026.03  18.6                       -
2026.03  21.4                       -