Multi-Task Learning as a Bargaining Game
About
In multi-task learning (MTL), a joint model is trained to simultaneously make predictions for several tasks. Joint training reduces computation costs and improves data efficiency; however, since the gradients of these different tasks may conflict, training a joint model for MTL often yields lower performance than its corresponding single-task counterparts. A common method for alleviating this issue is to combine per-task gradients into a joint update direction using a particular heuristic. In this paper, we propose viewing the gradient combination step as a bargaining game, where tasks negotiate to reach an agreement on a joint direction of parameter update. Under certain assumptions, the bargaining problem has a unique solution, known as the Nash Bargaining Solution, which we propose to use as a principled approach to multi-task learning. We describe a new MTL optimization procedure, Nash-MTL, and derive theoretical guarantees for its convergence. Empirically, we show that Nash-MTL achieves state-of-the-art results on multiple MTL benchmarks in various domains.
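The idea can be illustrated with a toy NumPy sketch. Under the bargaining view, the joint update is a positive combination of per-task gradients, Δθ = Σᵢ αᵢ gᵢ, where the weights α satisfy the Nash bargaining optimality condition G Gᵀ α = 1/α (elementwise, with the rows of G being the task gradients). The snippet below solves this condition with a simple damped fixed-point iteration; this is a hand-rolled approximation for illustration, not the sequential optimization procedure the paper actually uses, and the function name `nash_mtl_direction` is our own.

```python
import numpy as np

def nash_mtl_direction(grads, n_iter=200, eps=1e-8):
    """Toy sketch of the Nash bargaining update direction.

    grads: (K, d) array, one per-task gradient per row.
    Solves G G^T alpha = 1/alpha (elementwise) for alpha > 0 via a
    damped fixed-point iteration (illustrative only), then returns
    the weights and the joint update direction sum_i alpha_i g_i.
    """
    G = np.asarray(grads, dtype=float)
    GGt = G @ G.T                       # (K, K) Gram matrix of task gradients
    alpha = np.ones(G.shape[0]) / G.shape[0]
    for _ in range(n_iter):
        denom = np.clip(GGt @ alpha, eps, None)
        new_alpha = 1.0 / denom         # candidate from G G^T alpha = 1/alpha
        alpha = 0.5 * alpha + 0.5 * new_alpha  # damping for stability
    return alpha, alpha @ G
```

For two orthogonal gradients the condition reduces to αᵢ = 1/‖gᵢ‖, so a task's weight scales inversely with its gradient norm; this is the scale invariance that distinguishes the bargaining solution from naive gradient averaging.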
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Semantic segmentation | Cityscapes (test) | mIoU 75.41 | 1145 |
| Depth Estimation | NYU v2 (test) | -- | 423 |
| Semantic segmentation | NYU v2 (test) | mIoU 51.73 | 248 |
| Surface Normal Estimation | NYU v2 (test) | Mean Angle Distance (MAD) 23.21 | 206 |
| Depth Estimation | NYU Depth V2 | RMSE 0.78 | 177 |
| Semantic segmentation | NYU Depth V2 (test) | mIoU 40.13 | 172 |
| Semantic segmentation | NYUD v2 | mIoU 31.32 | 96 |
| Multi-Label Classification | ChestX-Ray14 (test) | -- | 88 |
| Multi-task Learning | Cityscapes (test) | MR 1.75 | 43 |
| Depth Estimation | Cityscapes (test) | Abs Err 0.0129 | 40 |