Inference Latency on Qwen 30B on M2 Ultra
Figure: Wall Time (s) by method; NPUMoE at 0.572 s (Apr 20, 2026). Other plotted metrics: Expansion Time (s), Attention Time (s), Speedup (x).
Evaluation Results
Method   Mode     Date      Wall Time (s)   Expansion Time (s)   Attention Time (s)   Speedup (x)
NPUMoE   ours     2026.04   0.572           0.464                0.086                1
coreml   coreml   2026.04   1.139           0.978                0.095                1.99
anemll   anemll   2026.04   1.144           0.988                0.09                 2
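The page does not say how the Speedup (x) column is derived. The reported values are consistent with each method's wall time divided by NPUMoE's wall time, so a row's figure would read as "NPUMoE is this many times faster than this method". The sketch below checks that reading and also computes how much wall time is not covered by the expansion and attention columns. The function names and the choice of NPUMoE as the reference are assumptions made for illustration, not something the benchmark documents.

```python
# Values copied from the Evaluation Results table (all in seconds).
rows = {
    "NPUMoE": {"wall": 0.572, "expansion": 0.464, "attention": 0.086},
    "coreml": {"wall": 1.139, "expansion": 0.978, "attention": 0.095},
    "anemll": {"wall": 1.144, "expansion": 0.988, "attention": 0.090},
}


def relative_wall_time(rows, reference="NPUMoE"):
    """Wall time of each method divided by the reference method's wall time.

    Assumption: this ratio is what the page labels "Speedup (x)".
    """
    ref = rows[reference]["wall"]
    return {name: round(r["wall"] / ref, 2) for name, r in rows.items()}


def unaccounted_time(rows):
    """Wall time left over after subtracting expansion and attention time."""
    return {
        name: round(r["wall"] - r["expansion"] - r["attention"], 3)
        for name, r in rows.items()
    }


ratios = relative_wall_time(rows)
residuals = unaccounted_time(rows)

for name in rows:
    print(f"{name}: ratio={ratios[name]}, unaccounted={residuals[name]} s")
```

Under this reading, nearly all of the wall-time gap between NPUMoE and the two baselines comes from the expansion column, while attention time differs by less than 0.01 s across methods.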