TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts

About

Learning discriminative task-specific features for multiple distinct tasks simultaneously is a fundamental problem in multi-task learning. Recent state-of-the-art models decode task-specific features directly from one shared task-generic feature (e.g., a feature from a backbone layer) and rely on carefully designed decoders to produce multi-task features. However, because the input feature is fully shared and each task decoder applies the same decoding parameters to every input sample, the decoding process is static and produces less discriminative task-specific representations. To tackle this limitation, we propose TaskExpert, a novel multi-task mixture-of-experts model that learns multiple representative task-generic feature spaces and decodes task-specific features dynamically. Specifically, TaskExpert introduces a set of expert networks that decompose the backbone feature into several representative task-generic features. The task-specific features are then decoded by dynamic task-specific gating networks operating on the decomposed task-generic features. Furthermore, to establish long-range modeling of the task-specific representations from different layers of TaskExpert, we design a multi-task feature memory that is updated at each layer and acts as an additional feature expert for dynamic task-specific feature decoding. Extensive experiments demonstrate that TaskExpert clearly outperforms previous best-performing methods on all 9 metrics of two competitive multi-task learning benchmarks for visual scene understanding (i.e., PASCAL-Context and NYUD-v2). Code and models will be made publicly available at https://github.com/prismformore/Multi-Task-Transformer

Hanrong Ye, Dan Xu • 2023
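
The decoding scheme described in the abstract (experts that decompose a backbone feature, per-task gates that mix them, and a memory that acts as one more expert) can be illustrated with a toy sketch. This is not the authors' implementation: the names `ToyTaskExpertLayer` and `update_memory`, the linear experts, the random weights, the choice to compute gate scores from the input feature, and the moving-average memory update with a `momentum` parameter are all assumptions made for illustration only.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class ToyTaskExpertLayer:
    """Hypothetical single layer: shared experts plus one gate per task."""

    def __init__(self, dim, num_experts, tasks, seed=0):
        rng = random.Random(seed)
        self.tasks = list(tasks)
        # Each expert is a linear map producing one task-generic feature.
        self.experts = [rand_matrix(dim, dim, rng) for _ in range(num_experts)]
        # Each task gate scores every expert plus one extra slot for the memory.
        self.gates = {t: rand_matrix(num_experts + 1, dim, rng) for t in self.tasks}

    def forward(self, x, memory):
        """x: one backbone feature vector; memory: dict of task -> vector."""
        dim = len(x)
        expert_feats = [matvec(W, x) for W in self.experts]
        feats, gate_weights = {}, {}
        for t in self.tasks:
            # The task's memory vector is treated as an additional expert output.
            candidates = expert_feats + [memory[t]]
            # Assumption: gate scores depend on the input, so the mix varies per sample.
            g = softmax(matvec(self.gates[t], x))
            feats[t] = [
                sum(g[k] * candidates[k][d] for k in range(len(candidates)))
                for d in range(dim)
            ]
            gate_weights[t] = g
        return feats, gate_weights


def update_memory(memory, feats, momentum=0.5):
    """Assumed update rule: moving average of each task's decoded feature."""
    return {
        t: [momentum * m + (1.0 - momentum) * f for m, f in zip(memory[t], feats[t])]
        for t in memory
    }


tasks = ["semseg", "depth"]
layer = ToyTaskExpertLayer(dim=4, num_experts=3, tasks=tasks)
x = [0.2, -0.1, 0.4, 0.3]
memory = {t: [0.0] * 4 for t in tasks}
feats, gate_weights = layer.forward(x, memory)
memory = update_memory(memory, feats)
```

Because the gate weights are a function of the input, two different inputs generally produce different expert mixtures for the same task, which is the sense in which this sketch is dynamic rather than static. In a multi-layer model the updated `memory` would be passed to the next layer's `forward` call.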

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Surface Normal Estimation | NYU v2 (test) | -- | 206 |
| Depth Estimation | NYU Depth V2 | RMSE 0.5157 | 177 |
| Semantic segmentation | NYUD v2 | mIoU 55.35 | 96 |
| Saliency Detection | Pascal Context (test) | maxF 84.87 | 57 |
| Surface Normal Estimation | Pascal Context (test) | mErr 13.56 | 50 |
| Multi-task Learning | Pascal Context | mIoU (Semantic Segmentation) 75.04 | 47 |
| Boundary Detection | Pascal Context (test) | ODSF 73.3 | 34 |
| Human Part Parsing | Pascal Context (test) | mIoU 69.42 | 20 |
| Boundary Detection | NYUD v2 | ODS F-measure 78.4 | 17 |
| Boundary Detection | NYUD2 | ODS Fmax 78.4 | 15 |
Showing 10 of 11 rows
