Large Language Model Inference on Qwen3-0.6B (inference)
[Chart: Storage (GB) by date. Plotted point: EDGERAZOR, 0.255 GB, Apr 10, 2026. Selectable metrics: Storage (GB), Memory (GB), Prefilling Throughput (tokens/s), Decoding Throughput (tokens/s).]
Updated 27d ago
Evaluation Results
| Method | Configuration | Date | Storage (GB) | Memory (GB) | Prefilling Throughput (tokens/s) | Decoding Throughput (tokens/s) |
|---|---|---|---|---|---|---|
| EDGERAZOR | W-A-KV=1.58-8-8, Weigh... | 2026.04 | 0.255 | 0.49 | 665.92 | 292.05 |
| EDGERAZOR | W-A-KV=1.58-8-8, Weigh... | 2026.04 | 0.275 | 0.509 | 659.63 | 325.19 |
| Llama.cpp PTQ | W-A-KV=2-8-8, Weight T... | 2026.04 | 0.323 | 0.639 | 704.7 | 224.8 |
| EDGERAZOR | W-A-KV=4-8-8, Weight T... | 2026.04 | 0.437 | 0.751 | 1,275.25 | 270.25 |
| Llama.cpp PTQ | W-A-KV=4-8-8, Weight T... | 2026.04 | 0.451 | 0.767 | 717.05 | 233.47 |
| Qwen3-0.6B | W-A-KV=16-16-16, Weigh... | 2026.04 | 1.406 | 1.747 | 335.91 | 21.6 |
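
The results above are easier to compare when each quantized entry is expressed relative to the 16-bit Qwen3-0.6B row. The following is a minimal sketch of that arithmetic, not part of the benchmark itself. The numbers are copied from the results, the configuration labels are left truncated as they appear in the source, and the function name `relative_to_baseline` and the dictionary keys are hypothetical names chosen for illustration.

```python
# Rows copied from the results:
# (method, config, storage_gb, memory_gb, prefill_tok_s, decode_tok_s).
# Config labels are truncated in the source and are kept that way.
ROWS = [
    ("EDGERAZOR", "W-A-KV=1.58-8-8, Weigh...", 0.255, 0.490, 665.92, 292.05),
    ("EDGERAZOR", "W-A-KV=1.58-8-8, Weigh...", 0.275, 0.509, 659.63, 325.19),
    ("Llama.cpp PTQ", "W-A-KV=2-8-8, Weight T...", 0.323, 0.639, 704.70, 224.80),
    ("EDGERAZOR", "W-A-KV=4-8-8, Weight T...", 0.437, 0.751, 1275.25, 270.25),
    ("Llama.cpp PTQ", "W-A-KV=4-8-8, Weight T...", 0.451, 0.767, 717.05, 233.47),
    ("Qwen3-0.6B", "W-A-KV=16-16-16, Weigh...", 1.406, 1.747, 335.91, 21.60),
]


def relative_to_baseline(rows, baseline_method="Qwen3-0.6B"):
    """Return ratios of every non-baseline row against the baseline row.

    Hypothetical helper: storage and memory are reported as reduction
    factors (baseline / entry), throughputs as speedups (entry / baseline).
    """
    baseline = next(r for r in rows if r[0] == baseline_method)
    _, _, b_storage, b_memory, b_prefill, b_decode = baseline
    out = []
    for method, config, storage, memory, prefill, decode in rows:
        if method == baseline_method:
            continue  # skip the baseline itself
        out.append({
            "method": method,
            "config": config,
            "storage_reduction": b_storage / storage,
            "memory_reduction": b_memory / memory,
            "prefill_speedup": prefill / b_prefill,
            "decode_speedup": decode / b_decode,
        })
    return out


if __name__ == "__main__":
    for r in relative_to_baseline(ROWS):
        print(
            f"{r['method']:<14} {r['config']:<26} "
            f"storage x{r['storage_reduction']:.2f}  "
            f"memory x{r['memory_reduction']:.2f}  "
            f"prefill x{r['prefill_speedup']:.2f}  "
            f"decode x{r['decode_speedup']:.2f}"
        )
```

These ratios only restate the reported figures. They say nothing about model accuracy, which this page does not list.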