
Inference Efficiency on MoE LLMs DSV2-16B, QW3-30B, QW3-80B-I

Decode Speed (tokens/sec): 12.46

BITSMOE

1.21764.13637.0559.9737
May 22, 2026
Updated 1d ago

Evaluation Results

Method  Links
2026.05  12.46  0.64  -       -     -    5.08  5.44
2026.05  10.39  0.47  -   29.51  0.69   27.65     -
2026.05   7.43  1.27  -       -     -       -     -
2026.05   5.71  1.51  -       -     -    8.58  6.29
2026.05   5.01  1.06  -       -     -   21.98  6.56
2026.05   3.25  2.94  -       -     -       -     -
2026.05   3.07  2.35  -   56.95  1.69      54     -
2026.05   2.59  7.32  -       -     -       -     -
2026.05   1.65  8.35  -  148.69  0.61  144.28     -