Inference Efficiency Benchmark
[Chart: TTFT (ms) by method. Highlighted point: PaliGemma-3B, 60 ms, Apr 9, 2026. Selectable metrics: TTFT (ms), Throughput (tok/s), Memory (GB), Efficiency (TPS/GB), Total Time (s).]
Updated 8d ago
Evaluation Results
Method           Method details           Links    TTFT (ms)  Throughput (tok/s)  Memory (GB)  Efficiency (TPS/GB)  Total Time (s)
PaliGemma-3B     Hardware=NVIDIA A6000... 2026.04  60         64.3                6.02         10.7                 1.554
LLaVA-1.5-7B     Hardware=NVIDIA A6000... 2026.04  93         44.2                14.51        3                    2.263
PaveGPT-7B       Hardware=NVIDIA A6000... 2026.04  236        39.7                16.81        2.4                  2.518
LLaMA-3.2-11B    Hardware=NVIDIA A6000... 2026.04  253        30.8                22.31        1.4                  3.247
InternVL-3.5-8B  Hardware=NVIDIA A6000... 2026.04  307        28.7                17.58        1.6                  3.481
LLaVA-1.6-7B     Hardware=NVIDIA A6000... 2026.04  347        39                  15.71        2.5                  2.566
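
The Efficiency (TPS/GB) column appears to be throughput divided by memory; the page does not state the formula, so treat that as an assumption. The sketch below recomputes the ratio from the reported throughput and memory values and checks it against the reported efficiency to one decimal place. The names `ROWS`, `efficiency`, and `mismatches` are illustrative and not part of the benchmark.

```python
# Assumption: Efficiency (TPS/GB) = Throughput (tok/s) / Memory (GB).
# Rows are copied from the results above:
# (method, throughput tok/s, memory GB, reported efficiency TPS/GB)
ROWS = [
    ("PaliGemma-3B", 64.3, 6.02, 10.7),
    ("LLaVA-1.5-7B", 44.2, 14.51, 3.0),
    ("PaveGPT-7B", 39.7, 16.81, 2.4),
    ("LLaMA-3.2-11B", 30.8, 22.31, 1.4),
    ("InternVL-3.5-8B", 28.7, 17.58, 1.6),
    ("LLaVA-1.6-7B", 39.0, 15.71, 2.5),
]


def efficiency(throughput_tps: float, memory_gb: float) -> float:
    """Tokens per second delivered per GB of memory used."""
    if memory_gb <= 0:
        raise ValueError("memory_gb must be positive")
    return throughput_tps / memory_gb


def mismatches(rows, tolerance=0.05):
    """Return the methods whose recomputed ratio differs from the reported one.

    The tolerance of 0.05 corresponds to rounding to one decimal place.
    """
    bad = []
    for method, tps, mem, reported in rows:
        if abs(efficiency(tps, mem) - reported) > tolerance:
            bad.append(method)
    return bad


if __name__ == "__main__":
    for method, tps, mem, reported in ROWS:
        print(f"{method:16s} computed={efficiency(tps, mem):.2f} reported={reported}")
    print("mismatches:", mismatches(ROWS))
```

Under this assumption, every reported efficiency value agrees with the recomputed ratio to within rounding.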