Generative Recommendation on MT
[Figure: Latency (ms) chart. Highlighted entry: MTServe, 11.7 ms (Apr 24, 2026). Selectable metrics: Latency (ms), Speedup (vs. RE), GPU Hit Ratio, Total Hit Ratio.]
Evaluation Results
| Method | Setting | Date | Latency (ms) | Speedup (vs. RE) | GPU Hit Ratio | Total Hit Ratio |
|---|---|---|---|---|---|---|
| MTServe | Batch Size=1 | 2026.04 | 11.7 | 1.21 | 60.26 | 96.73 |
| GPU-Only | Batch Size=1 | 2026.04 | 12.8 | 1.1 | 60.26 | 60.26 |
| Recomp. | Batch Size=1 | 2026.04 | 14.1 | 1 | - | - |
| MTServe | Batch Size=4 | 2026.04 | 17.3 | 2.52 | 60.69 | 97.44 |
| MTServe | Batch Size=8 | 2026.04 | 26.6 | 3.1 | 61.73 | 98.71 |
| GPU-Only | Batch Size=4 | 2026.04 | 28.9 | 1.51 | 60.69 | 60.69 |
| Recomp. | Batch Size=4 | 2026.04 | 43.6 | 1 | - | - |
| GPU-Only | Batch Size=8 | 2026.04 | 48.6 | 1.7 | 61.73 | 61.73 |
| Recomp. | Batch Size=8 | 2026.04 | 82.4 | 1 | - | - |
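The Speedup (vs. RE) values above appear consistent with dividing the Recomp. latency by each method's latency at the same batch size. The sketch below recomputes the column under that assumption; reading "RE" as the Recomp. baseline is an inference from the numbers, not something the page states, and the names `LATENCY_MS` and `speedup_vs_recomp` are hypothetical.

```python
# Latencies in milliseconds, copied from the results above,
# keyed by (method, batch size).
LATENCY_MS = {
    ("MTServe", 1): 11.7,
    ("GPU-Only", 1): 12.8,
    ("Recomp.", 1): 14.1,
    ("MTServe", 4): 17.3,
    ("GPU-Only", 4): 28.9,
    ("Recomp.", 4): 43.6,
    ("MTServe", 8): 26.6,
    ("GPU-Only", 8): 48.6,
    ("Recomp.", 8): 82.4,
}


def speedup_vs_recomp(method: str, batch_size: int, baseline: str = "Recomp.") -> float:
    """Return baseline latency divided by the method's latency at one batch size.

    Assumption: "RE" in the column header refers to the Recomp. rows.
    """
    return LATENCY_MS[(baseline, batch_size)] / LATENCY_MS[(method, batch_size)]


if __name__ == "__main__":
    # Print the recomputed speedup for every non-baseline row.
    for (method, batch_size) in sorted(LATENCY_MS, key=lambda k: (k[1], k[0])):
        if method != "Recomp.":
            ratio = speedup_vs_recomp(method, batch_size)
            print(f"{method:9s} batch={batch_size}: {ratio:.2f}x")
```

Under this reading, the recomputed ratios agree with the reported column to two decimal places, and the gap between MTServe and the Recomp. baseline widens as the batch size grows from 1 to 8.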