
Language Modeling Inference on Qwen2.5-7B (64K Context Length)

Decode Latency (ms/token): 13.6

FastMKA

[Chart: Decode Latency (ms/token); Mar 21, 2026]
Updated 25d ago

Evaluation Results

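| Date | Method | Decode Latency (ms/token) | |
|---|---|---|---|
| 2026.03 | FastMKA | 13.6 | 1.68 |
| 2026.03 | | 22.8 | - |
| 2026.03 | | 33.4 | - |
| 2026.03 | | 39.2 | - |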