Meta-Learning-Based Deep Reinforcement Learning for Multiobjective Optimization Problems

About

Deep reinforcement learning (DRL) has recently shown success in tackling complex combinatorial optimization problems. When these problems are extended to multiobjective ones, it becomes difficult for existing DRL approaches to deal flexibly and efficiently with the multiple subproblems determined by weight decomposition of the objectives. This paper proposes a concise meta-learning-based DRL approach. It first trains a meta-model by meta-learning. The meta-model is then fine-tuned with a few update steps to derive submodels for the corresponding subproblems, and the Pareto front is built from their solutions. Compared with other learning-based methods, our method can greatly shorten the training time of multiple submodels. Owing to the rapid and excellent adaptability of the meta-model, more submodels can be derived so as to increase the quality and diversity of the found solutions. Computational experiments on multiobjective traveling salesman problems and the multiobjective vehicle routing problem with time windows demonstrate the superiority of our method over most learning-based and iteration-based approaches.
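
The pipeline in the abstract (train a meta-model, fine-tune it briefly per weight vector, collect the non-dominated results) can be illustrated with a toy sketch. This is not the paper's implementation: it assumes a weighted-sum decomposition, a Reptile-style first-order meta-update, and a two-objective continuous toy problem in place of a DRL routing policy. All names (`A`, `B`, `adapt`, `train_meta`, `pareto_front`) and hyperparameters are hypothetical.

```python
import numpy as np

# Toy stand-in for a policy: the "model" is just a parameter vector theta.
# A and B are hypothetical minimizers of the two toy objectives.
A = np.array([0.0, 0.0])
B = np.array([1.0, 1.0])

def objectives(theta):
    """Return the two objective values (both to be minimized)."""
    return np.array([np.sum((theta - A) ** 2), np.sum((theta - B) ** 2)])

def scalarized_grad(theta, w):
    """Gradient of the weighted sum w[0]*f1 + w[1]*f2 (assumed decomposition)."""
    return 2.0 * (w[0] * (theta - A) + w[1] * (theta - B))

def adapt(theta, w, steps, lr=0.1):
    """Fine-tune a copy of theta on one subproblem with a few gradient steps."""
    theta = theta.copy()
    for _ in range(steps):
        theta -= lr * scalarized_grad(theta, w)
    return theta

def train_meta(theta, rng, meta_iters=200, inner_steps=5, meta_lr=0.5):
    """Reptile-style meta-training (an assumption, not confirmed by the abstract):
    sample a weight vector, adapt, then move theta toward the adapted parameters."""
    for _ in range(meta_iters):
        w1 = rng.uniform()
        w = np.array([w1, 1.0 - w1])
        adapted = adapt(theta, w, inner_steps)
        theta = theta + meta_lr * (adapted - theta)
    return theta

def pareto_front(points):
    """Keep only points that no other point dominates (minimization)."""
    points = np.asarray(points)
    keep = []
    for i, p in enumerate(points):
        dominated = np.any(
            np.all(points <= p, axis=1) & np.any(points < p, axis=1)
        )
        if not dominated:
            keep.append(i)
    return points[keep]

rng = np.random.default_rng(0)
meta_theta = train_meta(np.array([5.0, -3.0]), rng)

# Derive one submodel per weight vector with only a few update steps each.
weights = [np.array([w, 1.0 - w]) for w in np.linspace(0.0, 1.0, 11)]
solutions = [objectives(adapt(meta_theta, w, steps=3)) for w in weights]
front = pareto_front(solutions)
```

Because every submodel starts from the same meta-parameters, adding more weight vectors only costs a few extra fine-tuning steps each, which is the property the abstract credits for the denser Pareto front.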

Zizhen Zhang, Zhiyuan Wu, Hang Zhang, Jiahai Wang • 2021

Related benchmarks

Task	Dataset	Metric	Result	
Multi-Objective Traveling Salesperson Problem	KroAB200	Hypervolume (HV)	72.61	44
Bi-objective Traveling Salesman Problem	Bi-TSP50	Hypervolume (HV)	0.6408	44
Tri-Objective Traveling Salesman Problem	Tri-TSP50	Hypervolume (HV)	0.4408	44
Multi-Objective Traveling Salesperson Problem	KroAB100	Hypervolume (HV)	0.695	44
Multi-Objective Traveling Salesperson Problem	KroAB150	Hypervolume (HV)	68.9	44
Multi-objective Knapsack Problem	Bi-KP n=100	Hypervolume (HV)	0.4532	34
Multi-objective Knapsack Problem	Bi-KP n=200	Hypervolume (HV)	0.3601	34
Multi-objective Knapsack Problem	Bi-KP n=50	Hypervolume (HV)	0.353	34
Tri-Objective Traveling Salesman Problem	Tri-TSP 20 nodes (test)	Hypervolume (HV)	0.4712	24
Bi-objective Traveling Salesman Problem	Bi-TSP20	Hypervolume (HV)	0.6271	24

Showing 10 of 37 rows
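
For readers unfamiliar with the hypervolume (HV) metric reported above, here is a minimal sketch of how it can be computed for a two-objective minimization problem. The function name `hypervolume_2d` and the reference point are hypothetical; the reference points and any normalization behind the listed results are not stated here, so this will not reproduce those numbers.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by the reference point `ref`,
    for two objectives that are both minimized."""
    # Keep only points that are strictly better than ref in both objectives,
    # then sweep them in order of increasing first objective.
    pts = sorted(
        (float(p[0]), float(p[1]))
        for p in points
        if p[0] < ref[0] and p[1] < ref[1]
    )
    hv = 0.0
    best_f2 = ref[1]
    for f1, f2 in pts:
        if f2 < best_f2:  # dominated points add no new area and are skipped
            hv += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return hv

# Three mutually non-dominated points plus one dominated point (3, 3).
example = [(1, 3), (2, 2), (3, 1), (3, 3)]
print(hypervolume_2d(example, ref=(4, 4)))  # prints 6.0
```

A larger hypervolume means the found front covers more of the objective space relative to the reference point, so it rewards both solution quality and diversity.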
