Language Modeling Inference on Qwen2.5-7B (32K context length)

Decoding Latency (ms/token): 10.3

FastMKA

[Chart: Decoding Latency (ms/token) over time; y-axis ticks 9.568, 14.509, 19.45, 24.391; x-axis tick Mar 21, 2026]
Updated 25d ago

Evaluation Results

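| Date | Method | Decoding Latency (ms/token) | | Links |
|---|---|---|---|---|
| 2026.03 | | 10.3 | 1.59 | |
| 2026.03 | | 16.4 | - | |
| 2026.03 | | 24.3 | - | |
| 2026.03 | | 28.6 | - | |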