
Multi-Objective Reinforcement Learning for Adaptable Personalized Autonomous Driving

About

Human drivers exhibit individual preferences regarding driving style. Adapting autonomous vehicles to these preferences is essential for user trust and satisfaction. However, existing end-to-end driving approaches often rely on predefined driving styles or require continuous user feedback for adaptation, limiting their ability to support dynamic, context-dependent preferences. We propose a novel approach using multi-objective reinforcement learning (MORL) with preference-driven optimization for end-to-end autonomous driving that enables runtime adaptation to driving style preferences. Preferences are encoded as continuous weight vectors to modulate behavior along interpretable style objectives – including efficiency, comfort, speed, and aggressiveness – without requiring policy retraining. Our single-policy agent integrates vision-based perception in complex mixed-traffic scenarios and is evaluated in diverse urban environments using the CARLA simulator. Experimental results demonstrate that the agent dynamically adapts its driving behavior according to changing preferences while maintaining performance in terms of collision avoidance and route completion.
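The core idea of preference-conditioned MORL can be illustrated with a minimal sketch: per-step rewards along each style objective are combined into a single scalar via a weighted sum with the user's preference vector, so changing the weights at runtime shifts what the conditioned policy optimizes without retraining. The function and value names below are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of linear scalarization in MORL (illustrative names,
# not the paper's actual implementation).

def scalarize(rewards, weights):
    """Combine per-objective rewards into one scalar via a weighted sum."""
    assert len(rewards) == len(weights)
    return sum(r * w for r, w in zip(rewards, weights))

# Hypothetical per-step rewards along the four style objectives named
# in the abstract: efficiency, comfort, speed, aggressiveness.
rewards = [0.8, 0.5, 0.3, -0.2]

# A user preference weight vector (here normalized to sum to 1); swapping
# it at runtime changes the trade-off the agent pursues.
prefs = [0.4, 0.4, 0.1, 0.1]

print(round(scalarize(rewards, prefs), 3))  # → 0.53
```

In practice the preference vector is also fed to the policy network as an input, so a single policy can realize the whole family of trade-offs rather than one fixed scalarization.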

Hendrik Surmann, Jorge de Heuvel, Maren Bennewitz • 2025

Related benchmarks

Task | Dataset | Result | Rank
Long-term preference alignment | StyleDrive (ID) | Alignment Score (D1): 42 | 2
Long-term preference alignment | StyleDrive (OOD) | Alignment Score (D3): 25 | 2
