Inference Latency on VLM decode average
[Chart: Latency (ms) by date. Plotted point: W3A16, 21.1 ms, Dec 27, 2024.]
Evaluation Results
Method   Setting                     Date      Latency (ms)
W3A16    Model=LLaVA-onevision-...   2024.12   21.1
W4A8     Model=LLaVA-onevision-...   2024.12   26.3
FP16     Model=LLaVA-onevision-...   2024.12   29.6