
Language Modeling Inference on Qwen2.5-7B (256K Context Length)

Decode Latency (ms/token): 26.3

FastMKA

[Chart: Decode Latency (ms/token) over time; date axis label: Mar 21, 2026]
Updated 25d ago

Evaluation Results

Method    Links
2026.03    26.3    1.86
2026.03    48.9    -
2026.03    75.2    -
2026.03    87.6    -