Adapting Auxiliary Losses Using Gradient Similarity

About

One approach to deal with the statistical inefficiency of neural networks is to rely on auxiliary losses that help to build useful representations. However, it is not always trivial to know if an auxiliary task will be helpful for the main task and when it could start hurting. We propose to use the cosine similarity between gradients of tasks as an adaptive weight to detect when an auxiliary loss is helpful to the main loss. We show that our approach is guaranteed to converge to critical points of the main task and demonstrate the practical usefulness of the proposed algorithm in a few domains: multi-task supervised learning on subsets of ImageNet, reinforcement learning on gridworld, and reinforcement learning on Atari games.
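The weighting idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes gradients are available as flat vectors, and it assumes the weight is the cosine similarity clipped at zero, so that an auxiliary gradient pointing away from the main gradient is ignored. The function names `cosine_similarity` and `combined_gradient` are hypothetical.

```python
import numpy as np


def cosine_similarity(g_main, g_aux, eps=1e-12):
    """Cosine of the angle between two flat gradient vectors."""
    denom = np.linalg.norm(g_main) * np.linalg.norm(g_aux)
    if denom < eps:
        # Treat a (near-)zero gradient as carrying no directional information.
        return 0.0
    return float(np.dot(g_main, g_aux) / denom)


def combined_gradient(g_main, g_aux):
    """Main gradient plus the auxiliary gradient scaled by an adaptive weight.

    Assumption: the weight is max(0, cos), so the auxiliary task contributes
    only when its gradient does not oppose the main task's gradient.
    """
    weight = max(0.0, cosine_similarity(g_main, g_aux))
    return g_main + weight * g_aux, weight


if __name__ == "__main__":
    g_main = np.array([1.0, 0.0])
    for g_aux in (np.array([2.0, 0.0]),    # aligned with the main gradient
                  np.array([0.0, 3.0]),    # orthogonal to it
                  np.array([-1.0, 0.0])):  # opposed to it
        update, weight = combined_gradient(g_main, g_aux)
        print(g_aux, "weight:", weight, "update:", update)
```

With this choice of weight, the combined update never has a negative inner product with the main-task gradient, which matches the intuition behind the convergence claim. The sketch itself proves nothing about convergence.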

Yunshu Du, Wojciech M. Czarnecki, Siddhant M. Jayakumar, Mehrdad Farajtabar, Razvan Pascanu, Balaji Lakshminarayanan • 2018

Related benchmarks

Task                       Dataset                         Result                          Rank
Depth Estimation           NYU v2 (test)                   --                              423
Semantic segmentation      NYU v2 (test)                   mIoU 52.67                      248
Surface Normal Estimation  NYU v2 (test)                   Mean Angle Distance (MAD) 24.1  206
Few-shot classification    Meta-Dataset (test)             --                              48
Multi-task Learning        NYU v2 (test)                   Delta m% 9                      31
Multi-task Recommendation  AliExpress (test)               CTR ES 0.7229                   16
Image Recognition          DomainNet 50% test split (val)  Accuracy (Clipart) 74.6         16