Inference Latency on LLaMA 2 70B
[Chart: Latency (ms) by date. Plotted point: ROSE, 1,450 ms, Mar 6, 2026. Selectable metrics: Latency (ms), Speedup.]
Updated 1mo ago
Evaluation Results
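| Method    | Configuration             | Links   | Latency (ms) | Speedup |
|-----------|---------------------------|---------|--------------|---------|
| ROSE      | Sparsity pattern=2:4,...  | 2026.03 | 1,450        | 1.24    |
| SparseGPT | Sparsity pattern=2:4,...  | 2026.03 | 1,458        | 1.23    |
| Dense     | Sparsity pattern=dense    | 2026.03 | 1,791        | -       |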
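The page does not say how the Speedup column is computed. The short sketch below assumes it is the Dense latency divided by each method's latency, which reproduces the listed values to two decimal places. The dictionary `LATENCY_MS` and the function `speedup` are illustrative names, not part of the benchmark site.

```python
# Latencies in milliseconds, copied from the Evaluation Results entries.
LATENCY_MS = {"ROSE": 1450.0, "SparseGPT": 1458.0, "Dense": 1791.0}


def speedup(method: str, baseline: str = "Dense") -> float:
    """Return baseline latency divided by the method's latency.

    Assumption: this ratio is how the Speedup column is defined;
    the page itself does not state the formula.
    """
    return LATENCY_MS[baseline] / LATENCY_MS[method]


if __name__ == "__main__":
    for name in ("ROSE", "SparseGPT"):
        # Two decimals, matching the precision shown in the results.
        print(f"{name}: {speedup(name):.2f}x")
```

Under this assumption the gap between ROSE and SparseGPT is 8 ms, roughly 0.5% of either latency, so the two speedups differ only in the second decimal place.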