Large Language Model Inference on Decode Phase BS=1

Latency (s): 0.152

BWTA (Bitnet-b1.58-2B)

Apr 5, 2026
Updated 11d ago

Evaluation Results

Date     Latency (s)
2026.04  0.152        375.4
2026.04  0.156        365.9
2026.04  0.158        360.2
2026.04  0.285        344.4
2026.04  0.29         338
2026.04  0.295        331
2026.04  0.443        345.8
2026.04  0.449        338.4
2026.04  0.458        333.9
2026.04  0.512        111.6
2026.04  0.989        50.8
2026.04  1.146        87.3
2026.04  1.317        43.4
2026.04  1.86         83.3
2026.04  1.959        51
2026.04  2.563        39
2026.04  2.932        52.9
2026.04  3.507        44.2