vLLM Inference Performance on Qwen3-14B
[Chart: Safetensors Model Load Time (s) = 3.082, as of Dec 4, 2025. Selectable metrics: Model Load Time (s), First Token Latency (s), Throughput (tok/s), CPU Memory (MiB), Accelerator Memory (GiB).]
Evaluation Results

| Method | Links | Model Load Time (s) | First Token Latency (s) | Throughput (tok/s) | CPU Memory (MiB) | Accelerator Memory (GiB) |
|---|---|---|---|---|---|---|
| Safetensors | 2025.12 | 3.082 | 134.347 | 78.02 | 11,378.91 | 27.78 |
| CryptoTensors (Encryption=Unencrypted) | 2025.12 | 3.165 | 134.765 | 76.64 | 11,387.66 | 27.78 |
| CryptoTensors (Encryption=Encrypted) | 2025.12 | 61.879 | 136.105 | 78.61 | 11,425.65 | 27.78 |
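The page does not describe its measurement harness, but the load-time and throughput columns above could be gathered with a simple timing wrapper like the sketch below. This is an assumption about the methodology, not the benchmark's actual code: `measure_metrics`, `load_fn`, and `generate_fn` are hypothetical names, and throughput is assumed to mean generated tokens divided by decode wall time. In a real run, `load_fn` would wrap something like `vllm.LLM(model=...)` and `generate_fn` would call `LLM.generate`.

```python
import time

def measure_metrics(load_fn, generate_fn, n_tokens):
    """Return (load_time_s, throughput_tok_s) for a model loader and generator.

    Mirrors two columns of the table: "Model Load Time (s)" and
    "Throughput (tok/s)". Both callables are placeholders standing in
    for the real vLLM load and generate steps.
    """
    t0 = time.perf_counter()
    model = load_fn()                      # e.g. deserialize safetensors / CryptoTensors weights
    load_time = time.perf_counter() - t0

    t1 = time.perf_counter()
    generate_fn(model, n_tokens)           # decode n_tokens tokens
    gen_time = time.perf_counter() - t1
    return load_time, n_tokens / gen_time  # tokens per second of decode time

# Stub run with dummy callables (no GPU or vLLM required):
load_s, tput = measure_metrics(lambda: object(),
                               lambda model, n: time.sleep(0.01),
                               256)
print(f"load={load_s:.3f}s throughput={tput:.1f} tok/s")
```

Under this definition, the encrypted CryptoTensors row differs from the others almost entirely in `load_time` (61.879 s vs ~3 s), while decode-side metrics stay within noise of each other.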