vLLM Inference Performance on Qwen3-1.7B
[Chart: Model Load Time (s) by method, Dec 4, 2025. Highlighted point: Safetensors, 0.54 s. Selectable metrics: Model Load Time (s), First Token Latency (s), Throughput (tok/s), CPU Mem (MiB), Acc Mem (GiB).]
Evaluation Results
| Method | Variant | Date | Model Load Time (s) | First Token Latency (s) | Throughput (tok/s) | CPU Mem (MiB) | Acc Mem (GiB) |
|---|---|---|---|---|---|---|---|
| Safetensors | | 2025.12 | 0.54 | 101.911 | 198.45 | 10,547.66 | 3.25 |
| CryptoTensors | Encryption=Unencrypted | 2025.12 | 0.594 | 101.727 | 190.9 | 10,560.64 | 3.25 |
| CryptoTensors | Encryption=Encrypted | 2025.12 | 5.65 | 102.108 | 195.19 | 10,666.21 | 3.25 |
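
To make the differences between the rows easier to read, the reported figures can be expressed relative to the Safetensors baseline. The following is a minimal sketch, not part of the benchmark itself: the dictionary keys and the `relative_change` helper are hypothetical names chosen for illustration, and the numbers are copied from the results above.

```python
# Values copied from the Evaluation Results above.
# Keys and helper names are illustrative assumptions, not part of the benchmark.
results = {
    "Safetensors": {
        "load_s": 0.54, "first_token_s": 101.911,
        "throughput_tok_s": 198.45, "cpu_mem_mib": 10547.66,
    },
    "CryptoTensors (Unencrypted)": {
        "load_s": 0.594, "first_token_s": 101.727,
        "throughput_tok_s": 190.9, "cpu_mem_mib": 10560.64,
    },
    "CryptoTensors (Encrypted)": {
        "load_s": 5.65, "first_token_s": 102.108,
        "throughput_tok_s": 195.19, "cpu_mem_mib": 10666.21,
    },
}


def relative_change(value: float, baseline: float) -> float:
    """Return the percentage change of value with respect to baseline."""
    return (value - baseline) / baseline * 100.0


baseline = results["Safetensors"]

for method, row in results.items():
    if method == "Safetensors":
        continue  # the baseline is not compared against itself
    print(method)
    for metric, value in row.items():
        change = relative_change(value, baseline[metric])
        print(f"  {metric}: {value} ({change:+.2f}% vs Safetensors)")
```

Read this way, the encrypted CryptoTensors load time (5.65 s) is roughly 10.5 times the Safetensors load time (0.54 s), while first token latency, throughput, and CPU memory stay within a few percent of the baseline in all three rows.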