Inference Latency on OPT-175B first FFN layer

Best Latency (ms): 0.225 (LUT-GEMM, Jun 4, 2025)

Evaluation Results

Date       Latency (ms)
2025.06    0.225
2025.06    0.2688
2025.06    0.3238
2025.06    0.3599
2025.06    0.7256