Inference Efficiency on 1x H100 96GB GPU (synthetic)
[Chart: Throughput. Top result: SRM, 161,312 (May 9, 2026). Metrics: Throughput, Concurrency.]
Updated 22d ago
Evaluation Results
| Method      | Details                   | Date    | Throughput | Concurrency |
|-------------|---------------------------|---------|------------|-------------|
| SRM         | Implementation=Pytorch... | 2026.05 | 161,312    | 512,000     |
| Mamba       | Implementation=Pytorch... | 2026.05 | 61,465     | 1,000       |
| Transformer | Implementation=Pytorch... | 2026.05 | 32,134     | 6,000       |
| Mamba       | Implementation=Pytorch... | 2026.05 | 23,155     | 1,000       |
| Transformer | Implementation=Pytorch... | 2026.05 | 15,272     | 4,000       |
| RWKV        | Implementation=Pytorch... | 2026.05 | 3,820      | 1,024       |
| RWKV        | Implementation=Pytorch... | 2026.05 | 3,774      | 1,024       |