
Language Modeling Inference on Qwen2.5-7B (8K context length)

Decode Latency (ms/token): 7.1

FastMKA

[Chart: Decode Latency (ms/token); y-axis ticks 6.712, 9.331, 11.95, 14.569; x-axis tick Mar 21, 2026]
Updated 25d ago

Evaluation Results

Method    Links
2026.03   7.1    1.44
2026.03   10.2   -
2026.03   14.8   -
2026.03   16.8   -