
PD-MORL: Preference-Driven Multi-Objective Reinforcement Learning Algorithm

About

Multi-objective reinforcement learning (MORL) approaches have emerged to tackle many real-world problems with multiple conflicting objectives by maximizing a joint objective function weighted by a preference vector. These approaches find fixed customized policies corresponding to preference vectors specified during training. However, the design constraints and objectives typically change dynamically in real-life scenarios. Furthermore, storing a policy for each potential preference is not scalable. Hence, obtaining a set of Pareto front solutions for the entire preference space in a given domain with a single training is critical. To this end, we propose a novel MORL algorithm that trains a single universal network to cover the entire preference space scalable to continuous robotic tasks. The proposed approach, Preference-Driven MORL (PD-MORL), utilizes the preferences as guidance to update the network parameters. It also employs a novel parallelization approach to increase sample efficiency. We show that PD-MORL achieves up to 25% larger hypervolume for challenging continuous control tasks and uses an order of magnitude fewer trainable parameters compared to prior approaches.
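The "joint objective function weighted by a preference vector" mentioned above can be illustrated with a plain weighted sum. The following is a minimal sketch, not PD-MORL's training procedure: the function name `scalarize`, the normalization of the weights, and the example reward values are all assumptions made for illustration.

```python
from typing import Sequence


def scalarize(rewards: Sequence[float], preference: Sequence[float]) -> float:
    """Weighted sum of per-objective rewards under a preference vector."""
    if len(rewards) != len(preference):
        raise ValueError("rewards and preference must have the same length")
    if any(w < 0 for w in preference):
        raise ValueError("preference weights must be non-negative")
    total = sum(preference)
    if total <= 0:
        raise ValueError("preference weights must not all be zero")
    # Normalize so the weights sum to 1 (an assumption of this sketch).
    weights = [w / total for w in preference]
    return sum(w * r for w, r in zip(weights, rewards))


# Hypothetical two-objective reward, e.g. (forward speed, energy efficiency).
reward = [2.0, 0.5]
speed_focused = scalarize(reward, [0.9, 0.1])
energy_focused = scalarize(reward, [0.1, 0.9])
print(speed_focused, energy_focused)
```

The same reward vector yields a different scalar value under each preference, which is why a policy tuned for one fixed preference is generally not optimal for another.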

Toygun Basaklar, Suat Gumussoy, Umit Y. Ogras • 2022

Related benchmarks

Task | Dataset | Metric | Result | Rank
Continuous Control | MuJoCo Hopper3d | UT Score | 1.29 | 11
Continuous Control | MuJoCo Ant3d | UT | 1.29 | 11
Continuous Control | MuJoCo Halfcheetah2d | UT Score | 3.17 | 11
Continuous Control | MuJoCo Humanoid5d | Undiscounted Return (UT) | 0.38 | 11
Continuous Control | MuJoCo Walker2d | Uncertainty Time (UT) | 1.7 | 11
Continuous Control | MuJoCo Humanoid2d | UT Score | -0.05 | 11
Multi-objective Reinforcement Learning | Deep Sea Treasure | Hypervolume (HV) | 9.33 | 10
Multi-objective Reinforcement Learning | MuJoCo 8 continuous-action tasks MO-Gymnasium (aggregated) | Hypervolume (HV) | 3.25 | 7
Multi-objective Reinforcement Learning | Fruit Tree Navigation | UT | 5.03 | 7
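For readers unfamiliar with the hypervolume (HV) metric reported here, the sketch below computes it for two objectives that are both maximized. It is an illustrative implementation, not the evaluation code behind these benchmark numbers; the helper names, the sample points, and the reference point are assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep points not dominated by any other (both objectives maximized)."""
    front = []
    for p in points:
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and q != p for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(set(front))


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by the front and bounded below by the reference point."""
    front = [p for p in pareto_front(points) if p[0] > ref[0] and p[1] > ref[1]]
    area = 0.0
    prev_y = ref[1]
    # Sweep from the largest first objective down; each front point adds
    # the rectangular slab between its height and the previous height.
    for x, y in sorted(front, key=lambda p: p[0], reverse=True):
        if y > prev_y:
            area += (x - ref[0]) * (y - prev_y)
            prev_y = y
    return area


# Hypothetical returns of four policies on two objectives.
policies = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (1.0, 1.0)]
hv = hypervolume_2d(policies, ref=(0.0, 0.0))
print(hv)
```

A larger hypervolume means the set of policies covers more of the objective space relative to the reference point, so values are only comparable when the same reference point is used.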
