Language Modeling Inference on Qwen2.5-7B 128K context length

Decode Latency (ms/token): 18.4

FastMKA

[Chart: Decode Latency (ms/token) by date; axis ticks 16.804, 27.577, 38.35, 49.123; date shown: Mar 21, 2026]
Updated 25d ago

Evaluation Results

Method    Links
2026.03   18.4   1.78
2026.03   32.7   -
2026.03   49.8   -
2026.03   58.3   -