Inference Throughput on H200 GPU
[Chart: Throughput (2k Input) and Throughput (4k Input) over time. Highlighted point: Surefire-1B, 13,890 (2k Input), Oct 21, 2025.]
Updated 1mo ago
Evaluation Results

Method           Method (details)           Links    Throughput (2k Input)  Throughput (4k Input)
Surefire-1B      nlayers=16, dmodel=256...  2025.10  13,890                 11,283
LLaMA-3.2-1B     nlayers=16, dmodel=204...  2025.10  11,948                 9,306
LLaMA-3.2-1B-HF  nlayers=16, dmodel=204...  2025.10  11,948                 9,306
Panda-1B         nlayers=16, dmodel=256...  2025.10  8,961                  6,218
OLMo-2-1B-HF     nlayers=16, dmodel=204...  2025.10  7,486                  -
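
The figures in the table are easier to compare as ratios against one reference model. The sketch below is a minimal illustration and not part of the benchmark. It copies the values from the table, treats the missing 4k result as `None`, and uses LLaMA-3.2-1B as the baseline, which is an assumption made here for illustration. The helper name `relative_throughput` is hypothetical.

```python
# Throughput values copied from the results table as (2k input, 4k input).
# None marks the entry the table leaves blank ("-").
RESULTS = {
    "Surefire-1B": (13890, 11283),
    "LLaMA-3.2-1B": (11948, 9306),
    "LLaMA-3.2-1B-HF": (11948, 9306),
    "Panda-1B": (8961, 6218),
    "OLMo-2-1B-HF": (7486, None),
}


def relative_throughput(results, baseline):
    """Return each model's throughput divided by the baseline's, per input length.

    Hypothetical helper: the choice of baseline is an assumption, not
    something the benchmark page specifies.
    """
    base_2k, base_4k = results[baseline]
    ratios = {}
    for name, (t2k, t4k) in results.items():
        r2k = t2k / base_2k
        # Propagate missing results instead of guessing a value.
        r4k = t4k / base_4k if t4k is not None and base_4k is not None else None
        ratios[name] = (r2k, r4k)
    return ratios


if __name__ == "__main__":
    for name, (r2k, r4k) in relative_throughput(RESULTS, "LLaMA-3.2-1B").items():
        r4k_text = f"{r4k:.2f}x" if r4k is not None else "-"
        print(f"{name:<16} {r2k:.2f}x  {r4k_text}")
```

With this baseline, Surefire-1B comes out at roughly 1.16x for 2k input and 1.21x for 4k input, while Panda-1B sits at 0.75x for 2k input.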