
The Evolved Transformer

About

Recent works have highlighted the strength of the Transformer architecture on sequence tasks while, at the same time, neural architecture search (NAS) has begun to outperform human-designed models. Our goal is to apply NAS to search for a better alternative to the Transformer. We first construct a large search space inspired by the recent advances in feed-forward sequence models and then run evolutionary architecture search with warm starting by seeding our initial population with the Transformer. To directly search on the computationally expensive WMT 2014 English-German translation task, we develop the Progressive Dynamic Hurdles method, which allows us to dynamically allocate more resources to more promising candidate models. The architecture found in our experiments -- the Evolved Transformer -- demonstrates consistent improvement over the Transformer on four well-established language tasks: WMT 2014 English-German, WMT 2014 English-French, WMT 2014 English-Czech and LM1B. At a big model size, the Evolved Transformer establishes a new state-of-the-art BLEU score of 29.8 on WMT'14 English-German; at smaller sizes, it achieves the same quality as the original "big" Transformer with 37.6% fewer parameters and outperforms the Transformer by 0.7 BLEU at a mobile-friendly model size of 7M parameters.
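
The resource-allocation idea in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the abstract only says that more promising candidates receive more resources, so the staged budget, the mean-fitness hurdle, the noise model, and every name below (`noisy_fitness`, `allocate_with_hurdles`, the step increments) are assumptions made for illustration. Warm starting is mimicked by seeding the population with a baseline and perturbed copies of it.

```python
import random
from statistics import mean

def noisy_fitness(quality, steps, rng):
    # Hypothetical stand-in for "train for `steps` steps, then evaluate":
    # the measurement gets less noisy as the training budget grows.
    return quality + rng.gauss(0.0, 1.0 / steps ** 0.5)

def allocate_with_hurdles(qualities, step_increments, rng):
    """Give every candidate a small budget, then extend the budget only for
    candidates whose fitness clears the hurdle at each stage.
    The hurdle here is the mean fitness of the still-active candidates,
    which is an assumption of this sketch."""
    steps_used = {name: 0 for name in qualities}
    fitness = {}
    active = list(qualities)
    for extra_steps in step_increments:
        for name in active:
            steps_used[name] += extra_steps
            fitness[name] = noisy_fitness(qualities[name], steps_used[name], rng)
        hurdle = mean(fitness[name] for name in active)
        # Candidates below the hurdle stop here and keep their last fitness.
        active = [name for name in active if fitness[name] >= hurdle]
    return steps_used, fitness

rng = random.Random(0)

# Warm start (toy version): a seed model plus randomly perturbed copies of it.
baseline_quality = 1.0
population = {"seed": baseline_quality}
for i in range(7):
    population[f"mutant_{i}"] = baseline_quality + rng.uniform(-0.5, 0.5)

steps_used, fitness = allocate_with_hurdles(population, [100, 200, 400], rng)
for name in sorted(steps_used, key=steps_used.get, reverse=True):
    print(f"{name:10s} steps={steps_used[name]:4d} fitness={fitness[name]:.3f}")
```

In this sketch weak candidates are cut off after 100 or 300 steps, while candidates that keep clearing the hurdle are trained for the full 700, so most of the compute goes to the models that look most promising.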

David R. So, Chen Liang, Quoc V. Le • 2019

Related benchmarks

| Task | Dataset | Result | Rank |
| Machine Translation | WMT En-De 2014 (test) | BLEU 29.8 | 379 |
| Machine Translation | WMT En-Fr 2014 (test) | BLEU 41.3 | 237 |
| Machine Translation | WMT English-German 2014 (test) | BLEU 28.4 | 136 |
| Machine Translation | WMT 2014 (test) | BLEU 29.8 | 100 |
| Machine Translation | WMT En-De '14 | BLEU 29.8 | 89 |
| Machine Translation | WMT en-fr 14 | BLEU Score 41.3 | 56 |
| Machine Translation | WMT En-De (newstest2014) | BLEU 29.8 | 43 |
| Machine Translation | WMT English-French 2014 (test) | BLEU 41.3 | 41 |
| Language Modeling | C4 T5 (val) | PPLX 19.84 | 20 |
| Machine Translation | WMT English-German (EN-DE) 2014 (test) | BLEU Score 29.8 | 11 |
Showing 10 of 14 rows
