
AdapterDrop: On the Efficiency of Adapters in Transformers

About

Massively pre-trained transformer models are computationally expensive to fine-tune, slow for inference, and have large storage requirements. Recent approaches tackle these shortcomings by training smaller models, dynamically reducing the model size, and by training light-weight adapters. In this paper, we propose AdapterDrop, removing adapters from lower transformer layers during training and inference, which incorporates concepts from all three directions. We show that AdapterDrop can dynamically reduce the computational overhead when performing inference over multiple tasks simultaneously, with minimal decrease in task performances. We further prune adapters from AdapterFusion, which improves the inference efficiency while maintaining the task performances entirely.
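The core idea — skipping the adapter modules in the lowest transformer layers — can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the `Adapter` bottleneck size, the layer count, and the `n_drop=5` cutoff are illustrative assumptions, not values prescribed by the paper.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project, plus residual."""
    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(torch.relu(self.down(x)))

class TransformerWithAdapterDrop(nn.Module):
    """Toy encoder: adapters in the first `n_drop` (lower) layers are skipped."""
    def __init__(self, hidden_size: int = 768, num_layers: int = 12, n_drop: int = 5):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.TransformerEncoderLayer(hidden_size, nhead=12, batch_first=True)
             for _ in range(num_layers)]
        )
        self.adapters = nn.ModuleList(
            [Adapter(hidden_size) for _ in range(num_layers)]
        )
        self.n_drop = n_drop  # number of lower layers whose adapters are dropped

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for i, (layer, adapter) in enumerate(zip(self.layers, self.adapters)):
            x = layer(x)
            # AdapterDrop: only apply adapters above the dropped lower layers,
            # saving their forward (and backward) computation.
            if i >= self.n_drop:
                x = adapter(x)
        return x

# Example usage with a random batch of shape (batch, seq_len, hidden).
model = TransformerWithAdapterDrop()
out = model(torch.randn(2, 16, 768))
```

Because the lower layers then carry no task-specific parameters, their activations can also be shared across tasks when running inference over multiple tasks on the same input, which is where the paper's multi-task speedup comes from.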

Andreas R\"uckl\'e, Gregor Geigle, Max Glockner, Tilman Beck, Jonas Pfeiffer, Nils Reimers, Iryna Gurevych• 2020

Related benchmarks

Task | Dataset | Result | Rank
Natural Language Understanding | GLUE | SST-2: 94.7 | 531
Natural Language Understanding | GLUE (val) | SST-2: 93.6 | 191
Natural Language Understanding | GLUE | CoLA Score: 62.6 | 41
Visual Question Answering | Ultra-MedVQA Task 2 | Accuracy: 76.65 | 26
Visual Question Answering | Ultra-MedVQA Task 5 | Accuracy: 66.25 | 26
Visual Question Answering | Ultra-MedVQA Task 4 | Accuracy: 56.46 | 26
Visual Question Answering | Ultra-MedVQA Task 6 | Accuracy: 81.7 | 26
Visual Question Answering | Ultra-MedVQA Task 1 | Accuracy: 34.31 | 26
Visual Question Answering | Ultra-MedVQA Task 3 | Accuracy: 72.63 | 26
Natural Language Understanding | SuperGLUE | MultiRC Score: 72.9 | 22
