Language Model Inference on Sequence Bucket Long
[Chart: Latency (ms). COREY: 69.58, Apr 12, 2026. Metrics: Latency (ms), Throughput (k tok/s), DRAM Usage (B/token).]
Updated 4d ago
Evaluation Results
Method         Setting         Date     Latency (ms)  Throughput (k tok/s)  DRAM Usage (B/token)
COREY          Precision=FP16  2026.04  69.58         350.3                 241
Static Fusion  Precision=FP16  2026.04  73.61         328.3                 254
No Fusion      Precision=FP16  2026.04  88.54         273.1                 305
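The reported figures can be compared against the No Fusion baseline with a few lines of arithmetic. The sketch below is illustrative only: the dictionary keys and the helper names `relative_change` and `compare` are hypothetical and not part of the benchmark, and the numbers are copied from the results listed here.

```python
# Values copied from the Evaluation Results listing (all rows use Precision=FP16).
# The key names below are illustrative labels, not identifiers from the benchmark.
RESULTS = {
    "COREY": {"latency_ms": 69.58, "throughput_ktok_s": 350.3, "dram_b_per_token": 241},
    "Static Fusion": {"latency_ms": 73.61, "throughput_ktok_s": 328.3, "dram_b_per_token": 254},
    "No Fusion": {"latency_ms": 88.54, "throughput_ktok_s": 273.1, "dram_b_per_token": 305},
}


def relative_change(value, baseline):
    """Signed fractional change of `value` relative to `baseline`."""
    return (value - baseline) / baseline


def compare(method, baseline="No Fusion"):
    """Per-metric relative change of `method` against `baseline`."""
    base = RESULTS[baseline]
    return {
        metric: relative_change(RESULTS[method][metric], base[metric])
        for metric in base
    }


if __name__ == "__main__":
    for method in ("COREY", "Static Fusion"):
        for metric, change in compare(method).items():
            print(f"{method:14s} {metric:20s} {change:+.1%}")
```

Negative values mean a lower number than the baseline, which is an improvement for latency and DRAM usage; for throughput, a positive value is the improvement.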