
Inference Efficiency on Qwen2.5-7B

Top throughput (tokens/s): 1,480.2 (AWQ)

Updated 4mo ago

Evaluation Results

Method	Links	Throughput (tokens/s)	(unlabeled)	(unlabeled)
AWQ 2026.02		1,480.2	0.675	-
DACQ Hybrid 2026.02		1,093.1	0.915	-
DACQ Logistic 2026.02		1,035.4	0.966	-
Unquantized 2026.02		929.4	1.0759	-
FineRMoE 2026.03		27.3	-	178.3
DU 2026.03		25.6	-	84.8
S16A4 2026.03		24	-	78.5
NVShard 2026.03		18.9	-	137.8
C32A2 2026.03		0.2	-	50,245.9