Inference Efficiency on MoE LLMs: DSV2-16B, QW3-30B, QW3-80B-I

Decode Speed (tokens/sec): 12.46

BITSMOE

Updated 1mo ago

Evaluation Results

Method	Links	Decode Speed (tokens/sec)
BITSMOE 2026.05		12.46	0.64	-	-	-	5.08	5.44
FP16 2026.05		10.39	0.47	-	29.51	0.69	27.65	-
GPTQ 2026.05		7.43	1.27	-	-	-	-	-
BITSMOE 2026.05		5.71	1.51	-	-	-	8.58	6.29
BITSMOE 2026.05		5.01	1.06	-	-	-	21.98	6.56
GPTQ 2026.05		3.25	2.94	-	-	-	-	-
FP16 2026.05		3.07	2.35	-	56.95	1.69	54	-
GPTQ 2026.05		2.59	7.32	-	-	-	-	-
FP16 2026.05		1.65	8.35	-	148.69	0.61	144.28	-