Large Language Model Inference on Llama 3.2 1B
[Chart: TPOTH over time. Series: TPOTH, TPOT (BF16), TPOT (INT4); dashed baseline at 1.94. Latest point: Mar 15, 2026. Updated 1mo ago.]
Evaluation Results
| Method | Backbone | Date | TPOTH | TPOT (BF16) | TPOT (INT4) |
|---|---|---|---|---|---|
| Baseline | Llama-3.2-1B | 2026.03 | 1.94 | 7.69 | 3.6 |
| Vocab. Trimming | Llama-3.2-1B | 2026.03 | 1.07 | 6.82 | 2.73 |
| SVDSoftmax | Llama-3.2-1B | 2026.03 | 0.61 | 6.36 | 2.27 |
| FlashHead | Llama-3.2-1B | 2026.03 | 0.4 | 6.15 | 2.06 |