# DenseMTL: Cross-task Attention Mechanism for Dense Multi-task Learning

## About
Multi-task learning has recently emerged as a promising solution for a comprehensive understanding of complex scenes. In addition to being memory-efficient, multi-task models, when appropriately designed, can facilitate the exchange of complementary signals across tasks. In this work, we jointly address 2D semantic segmentation and three geometry-related tasks: dense depth estimation, surface normal estimation, and edge estimation, demonstrating their benefits on both indoor and outdoor datasets. We propose a novel multi-task learning architecture that leverages pairwise cross-task exchange through correlation-guided attention and self-attention to enhance the overall representation learning for all tasks. We conduct extensive experiments across three multi-task setups, showing the advantages of our approach over competitive baselines on both synthetic and real-world benchmarks. Additionally, we extend our method to the novel multi-task unsupervised domain adaptation setting. Our code is available at https://github.com/cv-rits/DenseMTL
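To illustrate the idea of pairwise cross-task exchange, here is a minimal NumPy sketch of correlation-guided cross-attention between the feature maps of two tasks. This is not the DenseMTL implementation; the function and variable names (`cross_task_attention`, `seg_feat`, `depth_feat`) are hypothetical, and the sketch only shows the general pattern of one task's features attending to another's:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_task_attention(feat_a, feat_b):
    """Sketch of pairwise cross-task attention (assumed simplification):
    task A's features (queries) attend to task B's features (keys/values),
    so A's representation is refined with complementary signals from B.

    feat_a, feat_b: (N, C) arrays -- N spatial locations, C channels.
    """
    scale = feat_a.shape[-1] ** -0.5
    corr = feat_a @ feat_b.T * scale      # (N, N) correlation between locations
    attn = softmax(corr, axis=-1)         # row-normalized attention weights
    exchanged = attn @ feat_b             # (N, C) message passed from B to A
    return feat_a + exchanged             # residual fusion into task A's stream

rng = np.random.default_rng(0)
seg_feat = rng.standard_normal((16, 8))    # e.g. segmentation features
depth_feat = rng.standard_normal((16, 8))  # e.g. depth features
refined = cross_task_attention(seg_feat, depth_feat)
print(refined.shape)  # (16, 8)
```

In the full model, such an exchange would run for every ordered task pair alongside per-task self-attention, with the refined features fed to each task's decoder head.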
## Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Depth Estimation | NYU v2 (test) | -- | 423 |
| Surface Normal Estimation | NYU v2 (test) | -- | 206 |
| Semantic Segmentation | NYUD v2 (test) | mIoU 40.84 | 187 |
| Multi-Task Learning | Cityscapes (test) | MR 40.05 | 43 |
| Edge Detection | NYUD v2 (test) | -- | 16 |
| Semantic Segmentation | SYNTHIA to Cityscapes 16 classes (test) | mIoU 37.93 | 13 |
| Multi-Task Learning | Synthia (test) | mIoU 82.99 | 10 |
| Multi-Task Learning | vKITTI 2 (test) | mIoU 97.53 | 10 |
| Monocular Depth Estimation | SYNTHIA to Cityscapes 16 classes UDA (val) | RMSE 11.66 | 9 |
| Multi-Task Learning Overall Improvement | NYUD v2 (test) | ΔSD (%) 5.8 | 8 |