HEADER: Hierarchical Robot Exploration via Attention-Based Deep Reinforcement Learning with Expert-Guided Reward

About

This work pushes the boundaries of learning-based methods in autonomous robot exploration in terms of environmental scale and exploration efficiency. We present HEADER, an attention-based reinforcement learning approach with hierarchical graphs for efficient exploration in large-scale environments. Like existing conventional methods, HEADER constructs hierarchical representations of the robot belief/map, but it further introduces a novel community-based algorithm to construct and update a global graph that remains fully incremental and shape-adaptive and operates with linear complexity. Building upon attention-based networks, our planner reasons finely about the nearby belief within the local range while coarsely leveraging distant information at the global scale, enabling next-best-viewpoint decisions that consider multi-scale spatial dependencies. Beyond the novel map representation, we introduce a parameter-free privileged reward that significantly improves model performance and produces near-optimal exploration behaviors by avoiding the training objective bias caused by handcrafted reward shaping. In challenging, large-scale simulated exploration scenarios, HEADER demonstrates better scalability than most existing learning and non-learning methods, while achieving a significant improvement in exploration efficiency (up to 20%) over state-of-the-art baselines. We also deploy HEADER on hardware and validate it in complex, large-scale real-life scenarios, including a 300 m × 230 m campus environment.
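To make the idea of a two-level (local/global) graph concrete, the following is a minimal Python sketch of incrementally grouping dense local nodes into coarse communities. It is not the authors' community-based algorithm, which the abstract does not specify: the class name `GlobalGraph`, the size cap `max_size`, the neighbor `radius`, and the greedy "join a neighboring community with room" rule are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from math import hypot


@dataclass
class GlobalGraph:
    """Toy two-level graph: local nodes grouped into coarse communities.

    All names and parameters here are hypothetical, not taken from HEADER.
    """
    max_size: int = 4      # assumed cap on local nodes per community
    radius: float = 1.5    # assumed distance within which local nodes are neighbors
    nodes: list = field(default_factory=list)         # (x, y) of each local node
    community_of: list = field(default_factory=list)  # community id of each local node
    members: dict = field(default_factory=dict)       # community id -> local node indices
    edges: set = field(default_factory=set)           # undirected edges between communities

    def add_node(self, x, y):
        """Insert one local node and update the coarse graph in place."""
        idx = len(self.nodes)
        # Brute-force neighbor search for brevity; a spatial hash would be
        # needed to keep the cost of each insertion independent of map size.
        neighbors = [i for i, (nx, ny) in enumerate(self.nodes)
                     if hypot(nx - x, ny - y) <= self.radius]
        self.nodes.append((x, y))

        # Greedy rule (assumption): join the first neighboring community
        # that still has room, otherwise open a new community.
        cid = next((self.community_of[i] for i in neighbors
                    if len(self.members[self.community_of[i]]) < self.max_size),
                   None)
        if cid is None:
            cid = len(self.members)
            self.members[cid] = []
        self.members[cid].append(idx)
        self.community_of.append(cid)

        # Neighbors that belong to other communities induce coarse edges.
        for i in neighbors:
            other = self.community_of[i]
            if other != cid:
                self.edges.add((min(cid, other), max(cid, other)))
        return cid


# Usage: eight local nodes along a corridor, added one at a time.
graph = GlobalGraph()
for step in range(8):
    graph.add_node(float(step), 0.0)
print(graph.community_of)
print(sorted(graph.edges))
```

Because each insertion only touches the new node's neighborhood, the coarse graph never has to be rebuilt from scratch, and communities follow whatever shape the explored space takes. That is the general flavor of the "incremental" and "shape-adaptive" properties the abstract describes, although the actual construction may differ substantially.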

Yuhong Cao, Yizhuo Wang, Jingsong Liang, Shuhao Liao, Yifeng Zhang, Peizhuo Li, Guillaume Sartoretti • 2025

Related benchmarks

Task	Dataset	Result
Autonomous Robotic Exploration	Warehouse Gazebo simulation (environment)	Exploration Distance (m): 492	5
Robotic Exploration	Forest Gazebo simulation	Distance (m): 1.23e+3	5
Robotic Exploration	Forest Gazebo simulation (test)	Distance (m): 1.23e+5	5
Robotic Exploration	Warehouse Gazebo simulation (test)	Distance (m): 4.92e+4	5
Robotic Exploration	Indoor Gazebo simulation (test)	Exploration Distance (m): 1.03e+5	5
