VLM Inference Latency on Jetson Orin (llama.cpp quantization)
[Chart: VE Latency (ms/patch) over time for LLaVA-v1.5-336; values near Dec 28, 2023 range from 2.8728 to 3.2211 ms/patch. Selectable metrics: VE Latency (ms/patch), Sample Throughput (tokens/s), Prompt Eval Throughput (tokens/s), Generation Throughput (tokens/s), Total Time (s).]
Evaluation Results

| Method | Language Model | Date | VE Latency (ms/patch) | Sample Throughput (tokens/s) | Prompt Eval Throughput (tokens/s) | Generation Throughput (tokens/s) | Total Time (s) |
|---|---|---|---|---|---|---|---|
| LLaVA-v1.5-336 | Vicuna... | 2023.12 | 2.89 | 9,281 | 367.26 | 17.74 | 19.75 |
| LLaVA-v1.5-336 | OpenLLa... | 2023.12 | 2.94 | 22,270 | 474.49 | 30.66 | 12.52 |
| LLaVA-v1.5-336 | TinyLLa... | 2023.12 | 2.98 | 24,655 | 1,253.94 | 76.63 | 5.9 |
| MobileVLM-336 | MobileL... | 2023.12 | 3.11 | 15,678 | 440.6 | 38.34 | 8.31 |
| MobileVLM-336 | MobileL... | 2023.12 | 3.32 | 17,712 | 667.69 | 65.27 | 5.14 |
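To put the per-patch VE latency in context, it can be converted to a total vision-encoding time per image. This is a minimal sketch under one assumption not stated on this page: that LLaVA-v1.5-336 uses a ViT-L/14 encoder at 336x336 input, which yields a (336 / 14)^2 = 576-patch grid.

```python
# Estimate total vision-encoder (VE) time per image from the per-patch
# latency reported in the table above.
# Assumption (not stated on this page): a ViT-L/14 encoder at 336x336
# input, i.e. (336 // 14) ** 2 = 576 patches per image.

def ve_time_seconds(latency_ms_per_patch: float,
                    image_size: int = 336,
                    patch_size: int = 14) -> float:
    """Total vision-encoding time in seconds for one image."""
    n_patches = (image_size // patch_size) ** 2  # 24 x 24 = 576
    return latency_ms_per_patch * n_patches / 1000.0

# LLaVA-v1.5-336 at 2.89 ms/patch -> about 1.66 s of VE time per image
print(round(ve_time_seconds(2.89), 2))
```

At these latencies the vision encoder accounts for roughly 1.7 to 1.9 seconds of each run, so most of the spread in Total Time across rows comes from the language-model side, not the encoder.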