Inference Latency Measurement on LLaMA 13B 5120 × 13824 linear layer
[Chart: Latency (ms) and Speed-up by method. Highlighted point: MBOK, 0.051 ms, May 28, 2025]
Evaluation Results
Method   Details                      Date      Latency (ms)   Speed-up
MBOK     Batch Size=1, Hardware...    2025.05   0.051          8.4
FP16     Batch Size=1, Hardware...    2025.05   0.4283         -
QUIP#    Batch Size=1, Hardware...    2025.05   0.6284         0.68
QTIP     Batch Size=1, Hardware...    2025.05   5.2368         0.09
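
The page does not say how the Speed-up column is derived. A minimal sketch, assuming it is the FP16 latency divided by each method's latency (FP16 is the only row without a speed-up value, which suggests it is the baseline). The `latency_ms` dictionary and the `speedup` helper are illustrative names, not part of the benchmark.

```python
# Latencies in milliseconds, copied from the results table.
latency_ms = {
    "MBOK": 0.051,
    "FP16": 0.4283,
    "QUIP#": 0.6284,
    "QTIP": 5.2368,
}


def speedup(method, baseline="FP16"):
    """Assumed definition: baseline latency divided by the method's latency."""
    return latency_ms[baseline] / latency_ms[method]


for name in latency_ms:
    # Values above 1 mean faster than the FP16 baseline, below 1 mean slower.
    print(f"{name}: {speedup(name):.3f}x")
```

Under this assumption the ratio reproduces the listed 8.4 for MBOK and 0.68 for QUIP#. For QTIP it gives roughly 0.08 against the listed 0.09, so the page may have used unrounded latencies or a different rounding rule.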