Towards Efficient NLP: A Standard Evaluation and A Strong Baseline

About

Supersized pre-trained language models have pushed the accuracy of various natural language processing (NLP) tasks to a new state of the art (SOTA). Rather than pursuing the unattainable SOTA accuracy, more and more researchers are paying attention to model efficiency and usability. Unlike accuracy, the metric for efficiency varies across studies, making fair comparison difficult. To that end, this work presents ELUE (Efficient Language Understanding Evaluation), a standard evaluation and public leaderboard for efficient NLP models. ELUE is dedicated to depicting the Pareto frontier for various language understanding tasks, so that it can tell whether, and by how much, a method achieves a Pareto improvement. Along with the benchmark, we also release a strong baseline, ElasticBERT, which allows BERT to exit at any layer in both static and dynamic ways. We demonstrate that ElasticBERT, despite its simplicity, outperforms or performs on par with SOTA compressed and early-exiting models. With ElasticBERT, the proposed ELUE has a strong Pareto frontier and provides a better evaluation for efficient NLP models.
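
To make the idea of a Pareto frontier and a Pareto improvement concrete, here is a minimal sketch. It is not ELUE's scoring code: the (cost, accuracy) pairs are made-up illustrative numbers, the helper names `dominates`, `pareto_frontier`, and `improves_frontier` are hypothetical, and "cost" stands in for whatever efficiency metric a study reports (lower is better).

```python
from typing import List, Tuple

# A model is summarized as (cost, accuracy): lower cost and higher accuracy are better.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is at least as good as b on both axes and strictly better on one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_frontier(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by increasing cost."""
    # Sort by cost ascending; on equal cost, the most accurate point comes first.
    ordered = sorted(points, key=lambda p: (p[0], -p[1]))
    frontier: List[Point] = []
    best_acc = float("-inf")
    for cost, acc in ordered:
        # A point survives only if it beats every cheaper point on accuracy.
        if acc > best_acc:
            frontier.append((cost, acc))
            best_acc = acc
    return frontier


def improves_frontier(candidate: Point, frontier: List[Point]) -> bool:
    """True if the candidate is neither dominated by nor equal to a frontier point."""
    return not any(f == candidate or dominates(f, candidate) for f in frontier)


# Hypothetical models, for illustration only (not results from the paper).
models = [(1.0, 80.0), (2.0, 85.0), (3.0, 84.0), (4.0, 90.0)]
frontier = pareto_frontier(models)
print(frontier)
print(improves_frontier((2.5, 88.0), frontier))
print(improves_frontier((3.0, 84.0), frontier))
```

In this toy example, the model at cost 3.0 is left off the frontier because a cheaper model is already more accurate, while a new method at (2.5, 88.0) would extend the frontier and so counts as a Pareto improvement.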

Xiangyang Liu, Tianxiang Sun, Junliang He, Jiawen Wu, Lingling Wu, Xinyu Zhang, Hao Jiang, Zhao Cao, Xuanjing Huang, Xipeng Qiu • 2021

Related benchmarks

Task                          Dataset                        Result              Rank
Natural Language Inference    SNLI (test)                    Accuracy -2.7       681
Sentiment Analysis            IMDB (test)                    Accuracy -2.5       248
Natural Language Inference    SciTail (test)                 Accuracy -0.1       86
Paraphrase Detection          QQP (test)                     Accuracy -0.2       51
Sentiment Analysis            Yelp (test)                    Accuracy -2.1       29
Natural Language Inference    SciTail source: MRPC (test)    Accuracy 0.00e+0    12
Natural Language Inference    SNLI source: MNLI (test)       Accuracy -1.3       12
Paraphrase Detection          QQP source: RTE (test)         Accuracy -0.3       12
Sentiment Classification      SST-IMDb                       Accuracy -0.027     12
Sentiment Classification      SST-Yelp                       Accuracy -2.6       12
Showing 10 of 11 rows
