Large Language Model Inference on Qwen2.5-7B (test)
[Chart: Throughput. Top result: TAQ-IS, 37.29 (Nov 9, 2025). Metrics: Throughput, Latency, Throughput Change vs. FP16, Latency Change vs. FP16.]
Updated 15d ago
Evaluation Results
| Method | Setting | Date | Throughput | Latency | Throughput Change vs. FP16 (%) | Latency Change vs. FP16 (%) |
|---|---|---|---|---|---|---|
| TAQ-IS | Batch size=1, Hardware... | 2025.11 | 37.29 | 26.82 | 35 | 26 |
| TAQ-KL | Batch size=1, Hardware... | 2025.11 | 32.83 | 30.48 | 19 | 16 |
| UNIFORM4 | Batch size=1, Hardware... | 2025.11 | 28.71 | 34.83 | 4 | 4 |
| TAQ-O | Batch size=1, Hardware... | 2025.11 | 27.86 | 35.93 | 1 | 1 |
| FP16 | Batch size=1, Hardware... | 2025.11 | 27.63 | 36.25 | - | - |
| AWQ-Int4 | Batch size=1, Hardware... | 2025.11 | 16.12 | 62.51 | 42 | 72 |
| GPTQ-Int4 | Batch size=1, Hardware... | 2025.11 | 4.4 | 227.26 | 84 | 527 |
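
The two "Change vs. FP16" columns appear to be percentage differences relative to the FP16 row, listed as magnitudes without a sign. The sketch below recomputes them from the listed throughput and latency values. It assumes the usual relative-change formula, (value - baseline) / baseline * 100, rounded to the nearest integer; the page does not state the formula or the units. The names `RESULTS`, `pct_change`, and `changes_vs_fp16` are illustrative, not part of the benchmark.

```python
# Throughput and latency per method, copied from the results above.
# Units are not stated in the source.
RESULTS = {
    "TAQ-IS": (37.29, 26.82),
    "TAQ-KL": (32.83, 30.48),
    "UNIFORM4": (28.71, 34.83),
    "TAQ-O": (27.86, 35.93),
    "FP16": (27.63, 36.25),
    "AWQ-Int4": (16.12, 62.51),
    "GPTQ-Int4": (4.4, 227.26),
}


def pct_change(value, baseline):
    """Signed percentage change of value relative to baseline."""
    return (value - baseline) / baseline * 100.0


def changes_vs_fp16(results, baseline="FP16"):
    """Return {method: (throughput change %, latency change %)}, rounded.

    Assumption: the leaderboard columns use this formula and show only
    the magnitude; the sign is reconstructed here.
    """
    base_tp, base_lat = results[baseline]
    return {
        name: (round(pct_change(tp, base_tp)), round(pct_change(lat, base_lat)))
        for name, (tp, lat) in results.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, (d_tp, d_lat) in changes_vs_fp16(RESULTS).items():
        print(f"{name:10s} throughput {d_tp:+d}%  latency {d_lat:+d}%")
```

Under that assumption the recomputed magnitudes agree with the listed ones, and the signs show the direction: TAQ-IS, TAQ-KL, UNIFORM4, and TAQ-O have higher throughput and lower latency than FP16, while AWQ-Int4 and GPTQ-Int4 have lower throughput and higher latency.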