Efficient Knowledge Transfer in Multi-Task Learning through Task-Adaptive Low-Rank Representation
About
Pre-trained language models (PLMs) demonstrate remarkable capabilities but struggle in real-world applications with emerging tasks unseen during training. Training a separate model for each new task is usually impractical. Multi-task learning (MTL) addresses this challenge by transferring shared knowledge from source tasks to target tasks. As a dominant parameter-efficient fine-tuning method, prompt tuning (PT) enhances MTL by prepending an adaptable vector, which captures task-specific knowledge, to the original prompt, which preserves shared knowledge, while keeping PLM parameters frozen. However, PT struggles to capture the heterogeneity of task-specific knowledge due to its limited representational capacity. To address this challenge, we propose Task-Adaptive Low-Rank Representation (TA-LoRA), an MTL method built on PT that models task heterogeneity with low-rank representations and a fast-slow weights mechanism: the slow weight encodes shared knowledge, while the fast weight captures task-specific nuances, avoiding the entanglement of shared and task-specific knowledge that arises when low-rank representations are trained from scratch. Moreover, a zero-initialized attention mechanism is introduced to minimize the disruption that immature low-rank components cause to the original prompts during warm-up epochs. Experiments on 16 tasks demonstrate that TA-LoRA achieves state-of-the-art performance in both full-data and few-shot settings while maintaining superior parameter efficiency.
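The fast-slow weights idea can be sketched roughly as follows. This is a minimal NumPy illustration of the general scheme, not the authors' implementation: the dimensions, variable names (`P`, `A_slow`, `B_fast`), and the zero initialization of the task-specific factors are all assumptions chosen for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)
m, d, r, n_tasks = 8, 16, 4, 3  # prompt length, embed dim, rank, number of tasks

# Shared soft prompt holding knowledge common to all tasks.
P = rng.normal(size=(m, d))
# Slow weight: one low-rank factor shared across tasks (updated slowly).
A_slow = rng.normal(size=(m, r))
# Fast weights: one per-task factor each (updated quickly), zero-initialized
# so that early training does not disrupt the shared prompt.
B_fast = [np.zeros((r, d)) for _ in range(n_tasks)]

def task_prompt(t: int) -> np.ndarray:
    """Task-adapted prompt = shared prompt + low-rank task-specific update."""
    return P + A_slow @ B_fast[t]

# At initialization every task falls back to the shared prompt.
print(np.allclose(task_prompt(0), P))  # True
```

The per-task update `A_slow @ B_fast[t]` costs only `(m + d) * r` extra parameters per task rather than a full `m * d` prompt, which is where the parameter efficiency comes from.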
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Reading Comprehension | C3 | Accuracy | 42.27 | 73 |
| Aspect-level Sentiment Analysis | COTE-BD | F1 Score | 93.26 | 34 |
| Natural Language Inference | OCNLI | Accuracy | 68.15 | 17 |
| Natural Language Inference | CMNLI (syntactically perturbed) | Accuracy | 74.57 | 17 |
| Question Answering | CMRC 2018 (syntactically perturbed) | F1 Score | 82.71 | 17 |
| Reading Comprehension | SanWen (syntactically perturbed) | F1 Score | 91.85 | 17 |
| Semantic Similarity | BQ (syntactically perturbed) | Accuracy | 76.87 | 17 |
| Sentiment Analysis | ChnSent | Accuracy | 91.7 | 17 |
| Sentiment Analysis | Amazon (syntactically perturbed) | Accuracy | 66.82 | 17 |
| Aspect-based Sentiment Analysis | COTE-MFW (syntactically perturbed) | F1 Score | 86.32 | 17 |