
Efficient Real-World Autonomous Racing via Attenuated Residual Policy Optimization

About

Residual policy learning (RPL), in which a learned policy refines a static base policy using deep reinforcement learning (DRL), has shown strong performance across various robotic applications. Its effectiveness is particularly evident in autonomous racing, a domain that serves as a challenging benchmark for real-world DRL. However, deploying RPL-based controllers introduces system complexity and increases inference latency. We address this by introducing an extension of RPL named attenuated residual policy optimization ($\alpha$-RPO). Unlike standard RPL, $\alpha$-RPO yields a standalone neural policy by progressively attenuating the base policy, which initially serves to bootstrap learning. Furthermore, this mechanism enables a form of privileged learning, where the base policy is permitted to use sensor modalities not required for final deployment. We design $\alpha$-RPO to integrate seamlessly with PPO, ensuring that the attenuated influence of the base controller is dynamically compensated during policy optimization. We evaluate $\alpha$-RPO by building a framework for 1:10-scale autonomous racing around it. In both simulation and zero-shot real-world transfer to Roboracer cars, $\alpha$-RPO not only reduces system complexity but also improves driving performance compared to baselines, demonstrating its practicality for robotic deployment. Our code is available at: https://github.com/raphajaner/arpo_racing.
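To illustrate the core idea, here is a minimal sketch of what the attenuation mechanism described in the abstract could look like. All names (`attenuation`, `composed_action`) and the linear annealing schedule are illustrative assumptions based only on the abstract's description, not the paper's actual implementation (see the linked repository for that):

```python
import numpy as np


def attenuation(step: int, total_steps: int) -> float:
    """Linearly anneal the base policy's weight from 1.0 to 0.0.

    Hypothetical schedule: the paper does not specify the exact
    attenuation curve in the abstract.
    """
    frac = min(step / total_steps, 1.0)
    return 1.0 - frac


def composed_action(residual_action: np.ndarray,
                    base_action: np.ndarray,
                    step: int,
                    total_steps: int) -> np.ndarray:
    """Blend the learned residual with the attenuated base controller.

    Early in training the base policy dominates (bootstrapping the
    learner); once its weight reaches zero, only the standalone
    neural policy remains, so the base controller can be dropped
    at deployment time.
    """
    alpha = attenuation(step, total_steps)
    return residual_action + alpha * base_action
```

At `step = 0` the output is `residual + base`; at `step = total_steps` it is the residual policy alone. The abstract's note that PPO must "dynamically compensate" for the shrinking base-policy influence is not captured by this sketch.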

Raphael Trumpp, Denis Hoornaert, Mirco Theile, Marco Caccamo• 2026

Related benchmarks

Task               Dataset                         Result                   Rank
Autonomous Racing  Simulated Racetracks (train)    Total Race Time: 34.13   42
Autonomous Racing  Simulated Racetracks (test)     Total Race Time: 28.91   42
Autonomous Racing  Munich racetrack (5 full laps)  Finishing Time (s): 28.1  4
