End-to-End Multi-Task Learning with Attention
About
We propose a novel multi-task learning architecture, which allows learning of task-specific feature-level attention. Our design, the Multi-Task Attention Network (MTAN), consists of a single shared network containing a global feature pool, together with a soft-attention module for each task. These modules allow for learning of task-specific features from the global features, whilst simultaneously allowing for features to be shared across different tasks. The architecture can be trained end-to-end and can be built upon any feed-forward neural network, is simple to implement, and is parameter efficient. We evaluate our approach on a variety of datasets, across both image-to-image predictions and image classification tasks. We show that our architecture is state-of-the-art in multi-task learning compared to existing methods, and is also less sensitive to various weighting schemes in the multi-task loss function. Code is available at https://github.com/lorenmt/mtan.
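The core mechanism described above — a per-task soft-attention mask applied to a shared global feature pool — can be sketched in a few lines. This is a minimal NumPy illustration, not the MTAN implementation (see the linked repository for that): the `task_attention` function, its parameters, and the 1x1-conv-as-matrix simplification are assumptions made for clarity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def task_attention(shared_feats, w, b):
    """Soft attention over a shared feature map for one task (sketch).

    shared_feats: (C, H, W) features from the shared network's global pool.
    w, b: parameters of a hypothetical 1x1 conv (shape (C, C) and (C,))
          that produces this task's attention logits.
    Returns task-specific features of the same shape as the input.
    """
    # A 1x1 convolution is a per-pixel linear map over channels.
    logits = np.einsum('oc,chw->ohw', w, shared_feats) + b[:, None, None]
    mask = sigmoid(logits)       # values in (0, 1): soft feature selection
    return mask * shared_feats   # gate the shared features for this task

rng = np.random.default_rng(0)
C, H, W = 8, 4, 4
shared = rng.standard_normal((C, H, W))   # one shared global feature pool

# One attention module per task; both tasks read the SAME shared features,
# so features remain shared while each task learns its own selection mask.
tasks = {t: (0.1 * rng.standard_normal((C, C)), np.zeros(C))
         for t in ("segmentation", "depth")}
feats = {t: task_attention(shared, w, b) for t, (w, b) in tasks.items()}
```

Because the sigmoid mask lies in (0, 1), each task's features are an elementwise-attenuated view of the shared pool, and the only task-specific parameters are the small attention modules — which is what makes the design parameter efficient.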
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Semantic Segmentation | Cityscapes (test) | mIoU: 75.24 | 1145 |
| Depth Estimation | NYU v2 (test) | -- | 423 |
| Semantic Segmentation | NYU v2 (test) | mIoU: 52.1 | 248 |
| Surface Normal Estimation | NYU v2 (test) | Mean Angle Distance (MAD): 16.6 | 206 |
| Depth Estimation | NYU Depth V2 | -- | 177 |
| Semantic Segmentation | NYU Depth V2 (test) | mIoU: 40.01 | 172 |
| Surface Normal Prediction | NYU V2 | Mean Error: 16.5 | 100 |
| Semantic Segmentation | NYUD v2 | mIoU: 39.39 | 96 |
| Multi-Label Classification | ChestX-Ray14 (test) | -- | 88 |
| Semantic Segmentation | Cityscapes v1 (test) | mIoU: 56.55 | 74 |