Inference Latency on VLM Prefill (512 Tokens)

Prefill Latency (ms): 59.4

W4A8


Evaluation Results

Method	Links	Prefill Latency (ms)
W4A8 2024.12		59.4
FP16 2024.12		68.8
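
The gap between the two rows can be expressed as a speedup factor and a percentage reduction. The sketch below is only illustrative arithmetic on the two reported numbers; the helper names `speedup` and `reduction_pct` are hypothetical, and treating FP16 as the baseline is an assumption, since the page does not name one.

```python
# Prefill latencies in milliseconds, copied from the results table.
latency_ms = {"W4A8": 59.4, "FP16": 68.8}


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


def reduction_pct(baseline_ms: float, candidate_ms: float) -> float:
    """Return the latency reduction relative to the baseline, in percent."""
    return 100.0 * (baseline_ms - candidate_ms) / baseline_ms


# Assumption: FP16 is used as the baseline for comparison.
s = speedup(latency_ms["FP16"], latency_ms["W4A8"])
r = reduction_pct(latency_ms["FP16"], latency_ms["W4A8"])

print(f"W4A8 vs FP16: {s:.3f}x speedup, {r:.1f}% lower prefill latency")
```

On these figures W4A8 comes out roughly 1.16x faster than FP16 at prefill, a reduction of about 9.4 ms.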