Inference Latency on OPT-175B first FFN layer
[Chart: Latency (ms) over time. Data point: LUT-GEMM, 0.225 ms, Jun 4, 2025]
Updated 4d ago
Evaluation Results
Method    Settings                   Date     Latency (ms)
LUT-GEMM  Schemes=UQ, BCQ, # Bit...  2025.06  0.225
LUT-GEMM  Schemes=UQ, BCQ, # Bit...  2025.06  0.2688
AWQ       Schemes=UQ, # Bits=4,...   2025.06  0.3238
GPTQ      Schemes=UQ, # Bits=3,...   2025.06  0.3599
cuBLAS    # Bits=16, Hardware=A1...  2025.06  0.7256