
HAT: Hardware-Aware Transformers for Efficient Natural Language Processing

About

Transformers are ubiquitous in Natural Language Processing (NLP) tasks, but they are difficult to deploy on hardware due to their intensive computation. To enable low-latency inference on resource-constrained hardware platforms, we propose to design Hardware-Aware Transformers (HAT) with neural architecture search. We first construct a large design space with $\textit{arbitrary encoder-decoder attention}$ and $\textit{heterogeneous layers}$. Then we train a $\textit{SuperTransformer}$ that covers all candidates in the design space and efficiently produces many $\textit{SubTransformers}$ with weight sharing. Finally, we perform an evolutionary search under a hardware latency constraint to find a specialized $\textit{SubTransformer}$ that runs fast on the target hardware. Extensive experiments on four machine translation tasks demonstrate that HAT can discover efficient models for different hardware (CPU, GPU, IoT device). When running the WMT'14 translation task on a Raspberry Pi-4, HAT achieves a $\textbf{3}\times$ speedup and $\textbf{3.7}\times$ smaller size over the baseline Transformer, and a $\textbf{2.7}\times$ speedup and $\textbf{3.6}\times$ smaller size over the Evolved Transformer with $\textbf{12,041}\times$ less search cost and no performance loss. HAT code is available at https://github.com/mit-han-lab/hardware-aware-transformers.git
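The pipeline above (sample SubTransformers from a design space, then run an evolutionary search under a hardware latency constraint) can be sketched in a few lines. This is a minimal illustration, not HAT's actual implementation: the design space values, `measure_latency`, and `fitness` are placeholder assumptions standing in for real on-device latency measurement (or a latency predictor) and validation loss evaluated with inherited SuperTransformer weights.

```python
import random

# Hypothetical SubTransformer design space (dimension names and values are illustrative).
SPACE = {
    "embed_dim": [512, 640],
    "num_decoder_layers": [1, 2, 3, 4, 5, 6],
    "ffn_dim": [1024, 2048, 3072],
    "num_heads": [4, 8],
}

def sample_subtransformer():
    """Randomly sample one SubTransformer configuration from the design space."""
    return {k: random.choice(v) for k, v in SPACE.items()}

def measure_latency(config):
    """Placeholder for measured latency on the target hardware (or a latency predictor)."""
    return config["embed_dim"] * config["num_decoder_layers"] * config["ffn_dim"] / 1e6

def fitness(config):
    """Placeholder for validation loss, evaluated with weights inherited from the
    SuperTransformer (lower is better); here, larger models score better."""
    return 1.0 / (1 + config["embed_dim"] + config["ffn_dim"] * config["num_decoder_layers"])

def evolutionary_search(latency_limit, population_size=20, generations=10, parents=5):
    # Seed the population only with configurations that satisfy the latency constraint.
    population = []
    while len(population) < population_size:
        cand = sample_subtransformer()
        if measure_latency(cand) <= latency_limit:
            population.append(cand)
    for _ in range(generations):
        population.sort(key=fitness)          # lowest "loss" first
        top = population[:parents]
        children = []
        while len(children) < population_size - parents:
            a, b = random.sample(top, 2)
            child = {k: random.choice([a[k], b[k]]) for k in SPACE}  # crossover
            gene = random.choice(list(SPACE))                        # mutation
            child[gene] = random.choice(SPACE[gene])
            if measure_latency(child) <= latency_limit:              # enforce hardware constraint
                children.append(child)
        population = top + children
    return min(population, key=fitness)

best = evolutionary_search(latency_limit=5.0)
print(best)
```

In the real system, every candidate inherits its weights from the shared SuperTransformer, so evaluating a SubTransformer requires no training from scratch, which is what makes the search cheap.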

Hanrui Wang, Zhanghao Wu, Zhijian Liu, Han Cai, Ligeng Zhu, Chuang Gan, Song Han • 2020

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Machine Translation | WMT En-De 2014 (test) | BLEU | 28.5 | 379 |
| Machine Translation | WMT En-Fr 2014 (test) | BLEU | 41.8 | 237 |
| Machine Translation | IWSLT De-En 2014 (test) | BLEU | 34.5 | 146 |
| Machine Translation | WMT En-De 2019 (test) | SacreBLEU | 42.9 | 37 |
| Performance Prediction | WMT En-De 2019 (val) | MAE | 0.91 | 16 |
| Performance Prediction | WMT En-De 2014 (val) | MAE | 1.14 | 16 |
| Performance Prediction | WMT En-Fr 2014 (val) | MAE | 1.59 | 16 |
| Performance Prediction | WMT Benchmarks Average, WMT'14 & WMT'19 (aggregation) | MAE | 1.21 | 16 |
| Performance Prediction | AmericasNLP Bribri to Spanish 2023 (test) | MAE | 0.28 | 4 |
| Performance Prediction | AmericasNLP Chatino to Spanish 2023 (test) | MAE | 1.55 | 4 |
Showing 10 of 13 rows

Other info

Code: https://github.com/mit-han-lab/hardware-aware-transformers.git
