RAEE: A Robust Retrieval-Augmented Early Exit Framework for Efficient Inference

About

Deploying large language model inference remains challenging due to its high computational overhead. Early exit optimizes model inference by adaptively reducing the number of inference layers. Current methods typically train internal classifiers or use heuristics to determine the exit layer. However, these methods either introduce significant training overhead or lead to performance degradation. To address these limitations, this paper proposes RAEE, a robust Retrieval-Augmented Early Exit framework that not only enables early exit but also enhances model performance through corrective exit information at intermediate layers. This paper first demonstrates that the early exit problem can be effectively modeled as a distribution prediction problem, in which the distribution can be further approximated through the exit information of similar data. Subsequently, this paper introduces the process of collecting exit information of correct predictions and the steps to construct the retrieval database. Finally, leveraging the pre-constructed retrieval database, RAEE utilizes the exit information from retrieved similar data to guide the backbone model's exit. Experimental results demonstrate that RAEE can not only accelerate inference but also achieve robust zero-shot performance across eight downstream tasks.
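The retrieval step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`build_database`, `predict_exit_layer`), the cosine-similarity weighting, the choice to store the earliest correct layer per example, and the toy two-dimensional embeddings are all assumptions made for illustration. It only shows the general idea of approximating an exit-layer distribution from the exit information of similar data.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_database(examples):
    """Build a toy retrieval database (hypothetical helper).

    examples: iterable of (embedding, correct_layers), where correct_layers
    lists the layers at which the backbone predicted correctly.
    Assumption: only the earliest correct layer is stored, and examples
    with no correct exit layer are skipped.
    """
    return [(emb, min(layers)) for emb, layers in examples if layers]


def predict_exit_layer(query, database, k=3):
    """Estimate an exit-layer distribution from the k most similar entries.

    Returns (layer, distribution), or (None, {}) if nothing useful is found.
    """
    neighbours = sorted(
        database, key=lambda entry: cosine(query, entry[0]), reverse=True
    )[:k]
    weights = defaultdict(float)
    for emb, layer in neighbours:
        # Assumption: similarity-weighted voting, with negatives clipped to 0.
        weights[layer] += max(cosine(query, emb), 0.0)
    total = sum(weights.values())
    if total == 0:
        return None, {}
    dist = {layer: w / total for layer, w in weights.items()}
    # Most probable layer; ties go to the earlier (cheaper) layer.
    best = max(dist, key=lambda layer: (dist[layer], -layer))
    return best, dist


# Toy usage with made-up embeddings and exit layers.
database = build_database([
    ([1.0, 0.0], [4, 6]),
    ([0.9, 0.1], [4]),
    ([0.0, 1.0], [10]),
    ([0.5, 0.5], []),  # never correct, so not stored
])
layer, dist = predict_exit_layer([1.0, 0.05], database, k=3)
print(layer, dist)
```

In a real system the embeddings would come from an encoder and the nearest-neighbour search from an approximate index; both are outside the scope of this sketch.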

Lianming Huang, Shangyu Wu, Yufei Cui, Ying Xiong, Haibo Hu, Xue Liu, Tei-Wei Kuo, Nan Guan, Chun Jason Xue • 2024

Related benchmarks

Task                         Dataset        Metric    Result  Rank
Subjectivity Classification  Subj           Accuracy  90.15   329
Question Classification      TREC           Accuracy  62.4    259
Sentiment Analysis           MR             Accuracy  0.8155  160
Sentiment Analysis           CR             Accuracy  68.05   141
Sentiment Analysis           SST-5          Accuracy  35.25   106
Sentiment Classification     MPQA           Accuracy  78.55   35
Text Summarization           CNN/DailyMail  ROUGE-L   14.01   2
Text Summarization           Xsum           ROUGE-L   7.15    2