Inference Throughput on BERT, GPT-2, and OPT (BS=2, SL=256)

Original Throughput (tokens/s): 57,117.4

BERT

[Chart: Original Throughput (tokens/s) for BERT; y-axis ticks 41,602.264, 45,630.232, 49,658.2, 53,686.168; x-axis label May 14, 2026]
Updated 19d ago

Evaluation Results

Method   Date      Original Throughput (tokens/s)   (unlabeled)   (unlabeled)   Links
-        2026.05   57,117.4                         58,756        1,638.7       -
-        2026.05   55,351.4                         57,787.8      2,436.5       -
-        2026.05   54,468.1                         57,853.1      3,385         -
-        2026.05   44,329                           46,126.1      1,797.1       -
-        2026.05   43,144.9                         44,263.9      1,119         -
-        2026.05   42,199                           43,760.7      1,561.7       -
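
Each result row above carries three figures, and the third appears to equal the second minus the first to within rounding. The sketch below checks that reading and derives a relative gain per row. Only the first figure is labeled in the source (Original Throughput, tokens/s); the difference interpretation, the `ROWS` list, and the `check_rows` helper are assumptions introduced here for illustration.

```python
# Rows transcribed from the results above: (first, second, third) figure per row.
# Only the first column is labeled in the source (Original Throughput, tokens/s).
# Treating the third figure as "second minus first" is an assumption.
ROWS = [
    (57117.4, 58756.0, 1638.7),
    (55351.4, 57787.8, 2436.5),
    (54468.1, 57853.1, 3385.0),
    (44329.0, 46126.1, 1797.1),
    (43144.9, 44263.9, 1119.0),
    (42199.0, 43760.7, 1561.7),
]


def check_rows(rows, tol=0.15):
    """Return (difference, relative gain in %, matches reported figure) per row."""
    out = []
    for first, second, reported in rows:
        diff = second - first
        rel = 100.0 * diff / first
        # tol absorbs the one-decimal rounding of the published figures
        out.append((round(diff, 1), round(rel, 2), abs(diff - reported) <= tol))
    return out


for diff, rel, ok in check_rows(ROWS):
    print(f"diff={diff:>8.1f}  gain={rel:5.2f}%  consistent={ok}")
```

A tolerance is used instead of exact equality because the published figures are rounded to one decimal place, so a recomputed difference can be off by about 0.1.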