Inference Throughput on BERT, GPT-2, and OPT (BS=2, SL=256)
[Chart: Original Throughput, New Throughput, and Throughput Improvement (tokens/s) by date; highlighted point: BERT, Original Throughput 57,117.4 tokens/s, May 14, 2026]
Updated 19d ago
Evaluation Results
| Method | Details | Date | Original Throughput (tokens/s) | New Throughput (tokens/s) | Throughput Improvement (tokens/s) |
|---|---|---|---|---|---|
| BERT | Hardware=A100, Batch s... | 2026.05 | 57,117.4 | 58,756 | 1,638.7 |
| GPT-2 | Hardware=A100, Batch s... | 2026.05 | 55,351.4 | 57,787.8 | 2,436.5 |
| OPT | Hardware=A100, Batch s... | 2026.05 | 54,468.1 | 57,853.1 | 3,385 |
| BERT | Hardware=V100, Batch s... | 2026.05 | 44,329 | 46,126.1 | 1,797.1 |
| OPT | Hardware=V100, Batch s... | 2026.05 | 43,144.9 | 44,263.9 | 1,119 |
| GPT-2 | Hardware=V100, Batch s... | 2026.05 | 42,199 | 43,760.7 | 1,561.7 |
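
The Throughput Improvement figures appear to be the New Throughput minus the Original Throughput for each row. The sketch below recomputes that difference from the listed values and also derives a relative gain in percent, which the page itself does not report. The function names `improvement` and `relative_gain_pct` are hypothetical helpers introduced for illustration. Two recomputed differences (BERT and GPT-2 on A100) differ from the listed values by 0.1 tokens/s, presumably because the displayed throughputs are rounded.

```python
# Rows copied from the results above:
# (model, hardware, original tokens/s, new tokens/s).
ROWS = [
    ("BERT", "A100", 57117.4, 58756.0),
    ("GPT-2", "A100", 55351.4, 57787.8),
    ("OPT", "A100", 54468.1, 57853.1),
    ("BERT", "V100", 44329.0, 46126.1),
    ("OPT", "V100", 43144.9, 44263.9),
    ("GPT-2", "V100", 42199.0, 43760.7),
]


def improvement(original: float, new: float) -> float:
    """Absolute gain in tokens/s, assumed to be new minus original."""
    return new - original


def relative_gain_pct(original: float, new: float) -> float:
    """Gain as a percentage of the original throughput (not reported on the page)."""
    return 100.0 * (new - original) / original


if __name__ == "__main__":
    for model, hardware, original, new in ROWS:
        gain = improvement(original, new)
        pct = relative_gain_pct(original, new)
        print(f"{model:6s} {hardware}  +{gain:8.1f} tokens/s  ({pct:.2f}%)")
```

Reading the gain as a percentage of the original throughput makes rows on different hardware easier to compare, since the A100 and V100 baselines differ by more than 10,000 tokens/s.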