Preference-Driven Multi-Objective Combinatorial Optimization with Conditional Computation
About
Recent deep reinforcement learning methods have achieved remarkable success in solving multi-objective combinatorial optimization problems (MOCOPs) by decomposing them into multiple subproblems, each associated with a specific weight vector. However, these methods typically treat all subproblems equally and solve them using a single model, hindering the effective exploration of the solution space and thus leading to suboptimal performance. To overcome the limitation, we propose POCCO, a novel plug-and-play framework that enables adaptive selection of model structures for subproblems, which are subsequently optimized based on preference signals rather than explicit reward values. Specifically, we design a conditional computation block that routes subproblems to specialized neural architectures. Moreover, we propose a preference-driven optimization algorithm that learns pairwise preferences between winning and losing solutions. We evaluate the efficacy and versatility of POCCO by applying it to two state-of-the-art neural methods for MOCOPs. Experimental results across four classic MOCOP benchmarks demonstrate its significant superiority and strong generalization.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Bi-objective Traveling Salesman Problem | Bi-TSP50 | Hypervolume (HV)0.6418 | 20 | |
| Bi-objective Traveling Salesman Problem | Bi-TSP150 | Hypervolume (HV)0.7062 | 20 | |
| Bi-objective Traveling Salesman Problem | Bi-TSP200 | Hypervolume (HV)73.99 | 20 | |
| Bi-objective Traveling Salesman Problem | Bi-TSP20 | HV0.6275 | 20 | |
| Bi-objective Traveling Salesman Problem | Bi-TSP100 | Hypervolume (HV)0.7078 | 20 | |
| Multi-Objective Traveling Salesperson Problem | KroAB100 | Hypervolume (HV)0.7006 | 20 | |
| Multi-Objective Traveling Salesperson Problem | KroAB150 | Hypervolume (HV)69.76 | 20 | |
| Multi-Objective Traveling Salesperson Problem | KroAB200 | Hypervolume (HV)73.69 | 20 | |
| Tri-Objective Traveling Salesman Problem | Tri-TSP50 | Hypervolume (HV)0.4437 | 20 | |
| Tri-Objective Traveling Salesman Problem | Tri-TSP100 | Hypervolume (HV)50.48 | 20 |