Inference Latency on VLM decode average
[Chart: Latency (ms) over time for W3A16; latest value 21.1 ms on Dec 27, 2024. Updated 4d ago.]
Evaluation Results

Method | Model               | Date    | Latency (ms)
W3A16  | LLaVA-onevision-... | 2024.12 | 21.1
W4A8   | LLaVA-onevision-... | 2024.12 | 26.3
FP16   | LLaVA-onevision-... | 2024.12 | 29.6
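A quick way to read the table above is as decode-latency speedups over the FP16 baseline. The sketch below is purely illustrative: it hard-codes the latencies from the table and computes each method's speedup relative to FP16 (it is not part of the benchmark harness).

```python
# Decode latencies (ms) taken from the Evaluation Results table above.
latencies = {"W3A16": 21.1, "W4A8": 26.3, "FP16": 29.6}

fp16 = latencies["FP16"]
for method, ms in latencies.items():
    # Speedup = baseline latency / method latency (higher is faster).
    speedup = fp16 / ms
    print(f"{method}: {ms} ms ({speedup:.2f}x vs FP16)")
```

By this measure, W3A16 decodes about 1.40x faster than FP16 and W4A8 about 1.13x faster.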