TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts

About

Learning discriminative task-specific features simultaneously for multiple distinct tasks is a fundamental problem in multi-task learning. Recent state-of-the-art models directly decode task-specific features from one shared task-generic feature (e.g., a feature from a backbone layer) and rely on carefully designed decoders to produce multi-task features. However, because the input feature is fully shared and each task decoder also shares its decoding parameters across input samples, the feature decoding process is static and produces less discriminative task-specific representations. To tackle this limitation, we propose TaskExpert, a novel multi-task mixture-of-experts model that learns multiple representative task-generic feature spaces and decodes task-specific features in a dynamic manner. Specifically, TaskExpert introduces a set of expert networks to decompose the backbone feature into several representative task-generic features. The task-specific features are then decoded by dynamic task-specific gating networks operating on the decomposed task-generic features. Furthermore, to establish long-range modeling of the task-specific representations from different layers of TaskExpert, we design a multi-task feature memory that updates at each layer and acts as an additional feature expert for dynamic task-specific feature decoding. Extensive experiments demonstrate that TaskExpert clearly outperforms previous best-performing methods on all 9 metrics of two competitive multi-task learning benchmarks for visual scene understanding (i.e., PASCAL-Context and NYUD-v2). Code and models will be made publicly available at https://github.com/prismformore/Multi-Task-Transformer
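The decoding scheme described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the class name `ToyTaskExpertLayer`, the linear experts and gates, the softmax gating, and especially the additive memory update are assumptions made for illustration, since the abstract only states that the memory "updates at each layer" and acts as an additional expert. The sketch uses a single feature vector and plain Python lists so that it runs without dependencies.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for learned parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class ToyTaskExpertLayer:
    """Hypothetical layer: shared experts, per-task gates, per-task memory."""

    def __init__(self, dim, num_experts, tasks, seed=0):
        rng = random.Random(seed)
        self.tasks = list(tasks)
        # Shared experts: each maps the backbone feature to one
        # task-generic feature.
        self.experts = [random_matrix(dim, dim, rng) for _ in range(num_experts)]
        # One gate per task; it scores every expert plus the memory slot.
        self.gates = {t: random_matrix(num_experts + 1, dim, rng) for t in self.tasks}

    def forward(self, feature, memory):
        generic = [matvec(w, feature) for w in self.experts]
        task_features, gate_weights = {}, {}
        for t in self.tasks:
            # The task's memory is treated as one more expert output.
            candidates = generic + [memory[t]]
            # Gate weights depend on the input, so decoding varies per sample.
            weights = softmax(matvec(self.gates[t], feature))
            task_features[t] = [
                sum(w * c[i] for w, c in zip(weights, candidates))
                for i in range(len(feature))
            ]
            gate_weights[t] = weights
        # ASSUMED update rule: accumulate this layer's task feature into memory.
        new_memory = {
            t: [m + f for m, f in zip(memory[t], task_features[t])]
            for t in self.tasks
        }
        return task_features, new_memory, gate_weights


# Usage: two stacked layers sharing one memory per task. For simplicity the
# same backbone feature is fed to both layers.
tasks = ["semseg", "depth"]
dim = 4
layers = [ToyTaskExpertLayer(dim, num_experts=3, tasks=tasks, seed=s) for s in range(2)]
feature = [0.2, -0.1, 0.4, 0.3]
memory = {t: [0.0] * dim for t in tasks}
for layer in layers:
    task_features, memory, gate_weights = layer.forward(feature, memory)
```

Because each gate is computed from the input feature, two different inputs generally receive different expert mixtures for the same task, which is the "dynamic" decoding the abstract contrasts with a fixed, shared decoder.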

Hanrong Ye, Dan Xu • 2023

Related benchmarks

Task	Dataset	Metric	Result	(unlabeled)
Surface Normal Estimation	NYU v2 (test)	--	--	224
Depth Estimation	NYU Depth V2	RMSE	0.5157	209
Depth Estimation	NYU V2	RMSE	0.5157	167
Semantic segmentation	NYUD v2	mIoU	55.35	150
Multi-task Learning	Pascal Context	mIoU (Semantic Segmentation)	75.04	75
Saliency Detection	Pascal Context (test)	maxF	84.87	57
Surface Normal Estimation	Pascal Context (test)	mErr	13.56	50
Surface Normal Estimation	Pascal Context	Mean Error (MAE)	13.56	45
Saliency Detection	Pascal Context	maxF Score	84.87	45
Semantic segmentation	Pascal Context	mIoU	80.64	42

Showing 10 of 16 rows
