VLM Inference Latency on Qualcomm Snapdragon 8 Gen 3 SoC (llama.cpp quantization)
[Chart: VE Latency (ms/patch) over time. Best result: 6.82 ms/patch (MobileVLM-336, Dec 28, 2023). Other tracked metrics: Sample Throughput (tokens/s), Prompt Evaluation Throughput (tokens/s), Generation Throughput (tokens/s), Total Inference Time (s).]
Evaluation Results
| Method | Language Model | Date | VE Latency (ms/patch) | Sample Throughput (tokens/s) | Prompt Evaluation Throughput (tokens/s) | Generation Throughput (tokens/s) | Total Inference Time (s) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MobileVLM-336 | MobileL... | 2023.12 | 6.82 | 34,892 | 34.93 | 21.54 | 18.51 |
| LLaVA-v1.5-336 | TinyLLa... | 2023.12 | 7.77 | 31,370 | 41.7 | 18.4 | 20.7 |
| LLaVA-v1.5-336 | OpenLLa... | 2023.12 | 7.98 | 27,530 | 8.95 | 7.22 | 84.43 |
| LLaVA-v1.5-336 | Vicuna... | 2023.12 | 8.23 | 17,347 | 5.36 | 0.25 | 329.89 |
| MobileVLM-336 | MobileL... | 2023.12 | 8.43 | 27,660 | 18.36 | 12.21 | 33.1 |