Latency Measurement on LLaMA linear layers 13B (inference)
[Chart: Latency (ms). Highlighted point: MBOK, 0.0499 ms, May 28, 2025. Available metrics: Latency (ms), Speed-up.]
Evaluation Results
Method   Weight Size       Date      Latency (ms)   Speed-up
MBOK     13824 × 51...     2025.05   0.0499         8.7
MBOK     5120 × 512...     2025.05   0.0507         3.25
MBOK     5120 × 138...     2025.05   0.051          8.4
FP16     5120 × 512...     2025.05   0.1654         -
FP16     5120 × 138...     2025.05   0.4283         -
FP16     13824 × 51...     2025.05   0.4341         -
QUIP#    5120 × 512...     2025.05   0.6226         0.27
QUIP#    5120 × 138...     2025.05   0.6284         0.68
QUIP#    13824 × 51...     2025.05   0.6284         0.69
QTIP     5120 × 512...     2025.05   1.9637         0.08
QTIP     13824 × 51...     2025.05   5.2119         0.08
QTIP     5120 × 138...     2025.05   5.2368         0.09
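The page does not define the Speed-up column. The listed values are consistent, to within rounding, with the FP16 latency divided by the method's latency at the same weight size, so a value above 1 means faster than FP16 and a value below 1 means slower. The sketch below works under that assumption; the function name `speed_up` and the dictionary layout are illustrative and not taken from the page, and the weight-size keys are the truncated labels exactly as displayed.

```python
# Latencies in milliseconds, copied from the results table.
# Keys are the truncated weight-size labels as shown on the page.
FP16_MS = {
    "5120 × 512...": 0.1654,
    "5120 × 138...": 0.4283,
    "13824 × 51...": 0.4341,
}
MBOK_MS = {
    "5120 × 512...": 0.0507,
    "5120 × 138...": 0.051,
    "13824 × 51...": 0.0499,
}


def speed_up(baseline_ms: float, method_ms: float) -> float:
    """Assumed definition: baseline latency divided by method latency."""
    if baseline_ms <= 0 or method_ms <= 0:
        raise ValueError("latencies must be positive")
    return baseline_ms / method_ms


if __name__ == "__main__":
    # Compare each MBOK entry against the FP16 entry with the same label.
    for size, baseline in FP16_MS.items():
        ratio = speed_up(baseline, MBOK_MS[size])
        print(f"{size}: {ratio:.2f}x vs FP16")
```

Because the table's latencies are themselves rounded, ratios recomputed this way can differ from the listed Speed-up values in the last digit.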