Language Modeling Inference on Qwen2.5-7B (4K Context Length)

Decode Latency (ms/token): 6.2

FastMKA

[Chart: Decode Latency (ms/token) over time. Y-axis ticks: 5.88, 8.04, 10.2, 12.36. X-axis: Mar 21, 2026]
Updated 25d ago

Evaluation Results

Method    Links
2026.03   6.2    1.4
2026.03   8.7    -
2026.03   12.4   -
2026.03   14.2   -
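
For readers who prefer throughput to latency, the decode latencies listed above (6.2, 8.7, 12.4, and 14.2 ms/token) can be converted to tokens per second and compared against the fastest entry. The following is a minimal sketch, not part of the leaderboard itself; the function names are hypothetical, and it assumes single-stream decoding, where throughput is simply the reciprocal of per-token latency.

```python
# Illustrative helpers only; these names are hypothetical and do not come
# from the leaderboard or from any benchmark harness.

def tokens_per_second(ms_per_token: float) -> float:
    """Convert per-token decode latency (ms) to throughput (tokens/s)."""
    if ms_per_token <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / ms_per_token


def slowdown_vs_best(entry_ms: float, best_ms: float) -> float:
    """Return how many times slower an entry is than the best entry."""
    return entry_ms / best_ms


# Decode latencies (ms/token) as listed in the evaluation results.
latencies_ms = [6.2, 8.7, 12.4, 14.2]
best_ms = min(latencies_ms)

for ms in latencies_ms:
    print(
        f"{ms:5.1f} ms/token  "
        f"{tokens_per_second(ms):6.1f} tokens/s  "
        f"{slowdown_vs_best(ms, best_ms):.2f}x the best latency"
    )
```

The conversion ignores batching and prefill cost, so it is only a rough way to read the latency column.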