In Defense of the Unitary Scalarization for Deep Multi-Task Learning
About
Recent multi-task learning research argues against unitary scalarization, where training simply minimizes the sum of the task losses. Several ad-hoc multi-task optimization algorithms have instead been proposed, inspired by various hypotheses about what makes multi-task settings difficult. The majority of these optimizers require per-task gradients, and introduce significant memory, runtime, and implementation overhead. We show that unitary scalarization, coupled with standard regularization and stabilization techniques from single-task learning, matches or improves upon the performance of complex multi-task optimizers in popular supervised and reinforcement learning settings. We then present an analysis suggesting that many specialized multi-task optimizers can be partly interpreted as forms of regularization, potentially explaining our surprising results. We believe our results call for a critical reevaluation of recent research in the area.
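Unitary scalarization simply sums the per-task losses into a single objective, so one backward pass suffices and no per-task gradients need to be stored. The following is a minimal sketch with hypothetical quadratic losses for two tasks on a shared scalar parameter `w` (the function names and values are illustrative, not from the paper):

```python
# Unitary scalarization sketch: two hypothetical task losses on a
# shared parameter w; the training objective is simply their sum.

def task_a_loss(w):
    # hypothetical quadratic loss for task A
    return (w - 1.0) ** 2

def task_b_loss(w):
    # hypothetical quadratic loss for task B
    return (w + 2.0) ** 2

def unitary_scalarization(w, losses):
    # the unitary scalarization objective: plain sum of task losses
    return sum(loss(w) for loss in losses)

def grad(f, w, eps=1e-6):
    # numerical gradient via central differences
    return (f(w + eps) - f(w - eps)) / (2.0 * eps)

# Plain gradient descent on the summed objective; unlike specialized
# multi-task optimizers, no per-task gradients are computed or combined.
w, lr = 0.0, 0.1
for _ in range(100):
    w -= lr * grad(lambda x: unitary_scalarization(x, [task_a_loss, task_b_loss]), w)

# The minimizer of (w-1)^2 + (w+2)^2 is w = -0.5.
print(round(w, 3))  # → -0.5
```

In a deep-learning framework the same idea is `total_loss = sum(task_losses); total_loss.backward()`, which is exactly the low-overhead baseline the paper defends.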
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Depth Estimation | NYU v2 (test) | -- | 423 |
| Semantic Segmentation | NYU v2 (test) | mIoU: 52.02 | 248 |
| Surface Normal Estimation | NYU v2 (test) | Mean Angle Distance (MAD): 23.79 | 206 |
| Image Classification | Office-Home (test) | -- | 199 |
| Multi-Task Learning | NYU v2 (test) | -- | 31 |
| Multi-Task Learning | NYU v2 | mIoU: 53.77 | 19 |
| Multi-Objective Learning | Office-31 | Amazon Accuracy: 0.8102 | 8 |
| Image Classification | MNIST (test) | Cross-Entropy Loss: 306.9 | 3 |