vLLM Inference Performance on Qwen3-0.6B
[Chart: Model Load Time (s); Safetensors: 0.336 (Dec 4, 2025). Selectable metrics: Model Load Time (s), First Token Latency (s), Throughput (tok/s), CPU Mem (MiB), Acc Mem (GiB).]
Updated 4d ago
Evaluation Results
Method                                  Date     Model Load Time (s)  First Token Latency (s)  Throughput (tok/s)  CPU Mem (MiB)  Acc Mem (GiB)
Safetensors                             2025.12  0.336                90.512                   197.15              10,557.58      1.17
CryptoTensors (Encryption=Unencrypted)  2025.12  0.34                 91.1                     195.3               10,558.89      1.17
CryptoTensors (Encryption=Encrypted)    2025.12  2.073                91.062                   193.92              10,613.48      1.17
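
To make the results above easier to compare, the following is a minimal sketch that expresses each method's metrics as a ratio to the Safetensors row. The numbers are copied from the results on this page; the dictionary keys and the helper name `relative_to_baseline` are illustrative assumptions, not part of the benchmark or of any library.

```python
# Values copied from the Evaluation Results on this page.
# Key names (load_s, first_token_s, ...) are hypothetical labels chosen for this sketch.
RESULTS = {
    "Safetensors": {
        "load_s": 0.336, "first_token_s": 90.512, "throughput_tok_s": 197.15,
        "cpu_mem_mib": 10557.58, "acc_mem_gib": 1.17,
    },
    "CryptoTensors (Unencrypted)": {
        "load_s": 0.34, "first_token_s": 91.1, "throughput_tok_s": 195.3,
        "cpu_mem_mib": 10558.89, "acc_mem_gib": 1.17,
    },
    "CryptoTensors (Encrypted)": {
        "load_s": 2.073, "first_token_s": 91.062, "throughput_tok_s": 193.92,
        "cpu_mem_mib": 10613.48, "acc_mem_gib": 1.17,
    },
}


def relative_to_baseline(results, baseline="Safetensors"):
    """Return each metric divided by the baseline method's value for that metric."""
    base = results[baseline]
    return {
        name: {metric: value / base[metric] for metric, value in row.items()}
        for name, row in results.items()
    }


ratios = relative_to_baseline(RESULTS)
for name, row in ratios.items():
    print(
        f"{name}: load time x{row['load_s']:.2f}, "
        f"throughput x{row['throughput_tok_s']:.3f}, "
        f"CPU mem x{row['cpu_mem_mib']:.4f}"
    )
```

Read this way, the encrypted CryptoTensors run differs from Safetensors mainly in model load time (roughly six times longer), while throughput and memory stay within about two percent of the baseline.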