Inference Efficiency on 1x V100 (16GB) (synthetic)
[Chart: Throughput (tokens/s) over time. Top result: SRM, 28,298 tokens/s (May 9, 2026).]
Updated 22d ago
Evaluation Results

| Method | Configuration | Date | Throughput (tokens/s) | Max Concurrency | Throughput Increase Factor |
|---|---|---|---|---|---|
| SRM | n_ctx=1024, d_m=1024,... | 2026.05 | 28,298 | 64,000 | 14.75 |
| SRM | n_ctx=512, d_m=1024, n... | 2026.05 | 28,091 | 64,000 | 9.66 |
| SRM | n_ctx=4096, d_m=1024,... | 2026.05 | 27,445 | 32,000 | 43.29 |
| SRM | n_ctx=2048, d_m=1024,... | 2026.05 | 27,441 | 32,000 | 24.59 |
| Transformer | n_ctx=512, d_m=512, n_... | 2026.05 | 2,908 | 400 | - |
| Transformer | n_ctx=1024, d_m=512, n... | 2026.05 | 1,918 | 200 | - |
| Transformer | n_ctx=2048, d_m=512, n... | 2026.05 | 1,116 | 100 | - |
| Transformer | n_ctx=4096, d_m=512, n... | 2026.05 | 634 | 50 | - |
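
The page does not define the Throughput Increase Factor. The listed values are, however, consistent with dividing the SRM throughput by the Transformer throughput at the same n_ctx. The sketch below checks that assumed definition against the figures listed above; the names `srm`, `transformer`, and `increase_factor` are illustrative and do not come from the benchmark itself.

```python
# Throughput (tokens/s) keyed by n_ctx, copied from the results listed above.
srm = {512: 28091, 1024: 28298, 2048: 27441, 4096: 27445}
transformer = {512: 2908, 1024: 1918, 2048: 1116, 4096: 634}


def increase_factor(n_ctx: int) -> float:
    """Assumed definition (not stated on the page):
    SRM throughput divided by Transformer throughput at the same n_ctx,
    rounded to two decimals to match the listed precision."""
    return round(srm[n_ctx] / transformer[n_ctx], 2)


# Compute the factor for every context length that has both measurements.
factors = {n_ctx: increase_factor(n_ctx) for n_ctx in srm}

for n_ctx, factor in factors.items():
    print(f"n_ctx={n_ctx}: {factor}")
```

Under this assumption the computed ratios reproduce the listed factors for all four context lengths. Note that the SRM and Transformer rows use different d_m values (1024 and 512), so the ratio compares the listed configurations as given rather than size-matched models.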