Direction-oriented Multi-objective Learning: Simple and Provable Stochastic Algorithms
About
Multi-objective optimization (MOO) has become an influential framework for many machine learning problems with multiple objectives, such as learning with multiple criteria and multi-task learning (MTL). In this paper, we propose a new direction-oriented multi-objective problem that regularizes the common descent direction to lie within a neighborhood of a direction optimizing a linear combination of objectives, such as the average loss in MTL. This formulation includes gradient descent (GD) and the multiple gradient descent algorithm (MGDA) as special cases, enjoys the direction-oriented benefit as in CAGrad, and facilitates the design of stochastic algorithms. To solve this problem, we propose Stochastic Direction-oriented Multi-objective Gradient descent (SDMGrad) with simple SGD-type updates, and its variant SDMGrad-OS with efficient objective sampling for settings where the number of objectives is large. For a constant-level regularization parameter $\lambda$, we show that SDMGrad and SDMGrad-OS provably converge to a Pareto stationary point with improved complexities and milder assumptions. As $\lambda$ increases, this convergent point reduces to a stationary point of the linear combination of objectives. We demonstrate the superior performance of the proposed methods in a series of tasks on multi-task supervised learning and reinforcement learning. Code is provided at https://github.com/ml-opt-lab/sdmgrad.
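The update described above can be illustrated with a minimal sketch: given the per-objective gradients stacked as rows of a matrix `G`, one finds simplex weights `w` that minimize the norm of the combined direction regularized toward a target direction `g0` (here taken as the average gradient), then descends along the resulting direction. All function names, step sizes, and iteration counts below are illustrative assumptions, not the released implementation.

```python
import numpy as np

def proj_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1.0))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def sdmgrad_direction(G, lam=0.3, steps=50, beta=0.5):
    """Sketch of an SDMGrad-style direction: G is (k, d) with one
    objective gradient per row; lam controls the pull toward the
    average-loss direction g0 (lam=0 recovers an MGDA-style direction,
    large lam approaches plain GD on the average loss)."""
    k = G.shape[0]
    g0 = G.mean(axis=0)               # direction optimizing the average loss
    w = np.full(k, 1.0 / k)           # start from uniform weights
    for _ in range(steps):
        # gradient of 0.5 * ||G^T w + lam * g0||^2 with respect to w
        grad_w = G @ (G.T @ w + lam * g0)
        w = proj_simplex(w - beta * grad_w)   # projected (S)GD step on w
    d = (G.T @ w + lam * g0) / (1.0 + lam)    # common descent direction
    return w, d
```

In a training loop, each objective's gradient row would be replaced by a stochastic estimate, and the model parameters would be updated as `theta -= alpha * d`; the sampling variant would estimate `G.T @ w` from a subset of objectives.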
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Semantic segmentation | Cityscapes (test) | mIoU 74.53 | 1145 |
| Depth Estimation | NYU v2 (test) | -- | 423 |
| Surface Normal Estimation | NYU v2 (test) | Mean Angle Distance (MAD) 25.07 | 206 |
| Semantic segmentation | NYU Depth V2 (test) | mIoU 40.47 | 172 |
| Multi-task Learning | Cityscapes (test) | MR 6.25 | 43 |
| Depth Estimation | Cityscapes (test) | Abs Err 0.0137 | 40 |
| Multi-task reinforcement learning | Metaworld MT10 v2 (train/eval) | Time 6.8 | 11 |