Inference Latency Measurement on LLaMA-13B 5120 × 5120 linear layer
[Chart: Latency (ms) and Speed-up by method. Highlighted entry: MBOK, 0.0507 ms, May 28, 2025.]
Evaluation Results
Method  Details                    Date     Latency (ms)  Speed-up
MBOK    Batch Size=1, Hardware...  2025.05  0.0507        3.25
FP16    Batch Size=1, Hardware...  2025.05  0.1654        -
QUIP#   Batch Size=1, Hardware...  2025.05  0.6226        0.27
QTIP    Batch Size=1, Hardware...  2025.05  1.9637        0.08
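
The page does not state how the Speed-up column is computed. The short sketch below assumes it is the FP16 latency divided by each method's latency, which would explain why the FP16 row shows "-". The function name `speed_up` and the dictionary layout are illustrative choices, not part of the benchmark.

```python
# Latencies (ms) copied from the results table above.
latency_ms = {
    "MBOK": 0.0507,
    "FP16": 0.1654,
    "QUIP#": 0.6226,
    "QTIP": 1.9637,
}


def speed_up(baseline_ms: float, method_ms: float) -> float:
    """Assumed definition: baseline latency divided by method latency."""
    return baseline_ms / method_ms


# Assumption: FP16 is the baseline, since its Speed-up cell is "-".
baseline = latency_ms["FP16"]
speed_ups = {
    name: speed_up(baseline, ms)
    for name, ms in latency_ms.items()
    if name != "FP16"
}

for name, value in speed_ups.items():
    print(f"{name}: {value:.2f}x")
```

Under this assumption the recomputed values for QUIP# and QTIP round to the listed 0.27 and 0.08. MBOK comes out at roughly 3.26 against the listed 3.25, a gap small enough to be consistent with the page having used unrounded latencies.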