Visual-RRT: Finding Paths toward Visual-Goals via Differentiable Rendering
About
Rapidly-exploring random trees (RRTs) have been widely adopted for robot motion planning due to their robustness and theoretical guarantees. However, existing RRT-based planners require explicit goal configurations specified as numerical joint angles, while many practical applications provide goal specifications through visual observations such as images or demonstration videos where precise goal configurations are unavailable. In this paper, we propose visual-RRT (vRRT), a motion planner that enables visual-goal planning by unifying gradient-based exploitation from differentiable robot rendering with sampling-based exploration from RRTs. We further introduce (i) a frontier-based exploration-exploitation strategy that adaptively prioritizes visually promising search regions, and (ii) inertial gradient tree expansion that inherits optimization states across tree branches for momentum-consistent gradient exploitation. Extensive experiments across various robot manipulators including Franka, UR5e, and Fetch demonstrate that vRRT achieves effective visual-goal planning in both simulated and real-world settings, bridging the gap between sampling-based planning and vision-centric robot applications. Our code is available at https://sgvr.kaist.ac.kr/Visual-RRT.
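The interplay the abstract describes, gradient exploitation on a visual loss combined with RRT-style random exploration, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a 2-link planar arm, replaces the differentiable renderer with a hypothetical `render` function that returns 2D keypoints, and approximates gradients with finite differences. The names `plan`, `p_exploit`, `beta`, and the rule "expand the unexpanded node with the lowest visual loss" are illustrative stand-ins for the frontier strategy and inertial gradient expansion, chosen only to show the idea of child nodes inheriting a parent's momentum.

```python
import numpy as np

LINKS = np.array([1.0, 1.0])  # link lengths of the toy planar arm (assumption)

def render(q):
    """Stand-in for a differentiable renderer: elbow and tip keypoints in 2D."""
    a, s = q[0], q[0] + q[1]
    elbow = LINKS[0] * np.array([np.cos(a), np.sin(a)])
    tip = elbow + LINKS[1] * np.array([np.cos(s), np.sin(s)])
    return np.concatenate([elbow, tip])

def visual_loss(q, goal_obs):
    """Squared error between the rendered observation and the goal observation."""
    return float(np.sum((render(q) - goal_obs) ** 2))

def loss_grad(q, goal_obs, eps=1e-5):
    """Central finite differences; a real system would backpropagate instead."""
    g = np.zeros_like(q)
    for i in range(q.size):
        d = np.zeros_like(q)
        d[i] = eps
        g[i] = (visual_loss(q + d, goal_obs) - visual_loss(q - d, goal_obs)) / (2 * eps)
    return g

def clip_step(v, max_norm):
    """Limit the length of one tree extension."""
    n = np.linalg.norm(v)
    return v if n <= max_norm else v * (max_norm / n)

def plan(q_start, goal_obs, p_exploit=0.7, lr=0.05, beta=0.5, step=0.2,
         tol=1e-3, max_iters=2000, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = -np.pi, np.pi  # joint limits (assumption)
    q0 = np.asarray(q_start, dtype=float)
    nodes, parent, vel = [q0], [-1], [np.zeros_like(q0)]
    loss, expanded = [visual_loss(q0, goal_obs)], [False]

    for _ in range(max_iters):
        open_ids = [k for k, done in enumerate(expanded) if not done]
        if open_ids and rng.random() < p_exploit:
            # Exploit: take a momentum gradient step from the most promising
            # unexpanded node, starting from the velocity stored on that node.
            i = min(open_ids, key=lambda k: loss[k])
            expanded[i] = True
            v_new = clip_step(beta * vel[i] - lr * loss_grad(nodes[i], goal_obs), step)
            q_new = nodes[i] + v_new
        else:
            # Explore: standard RRT extension toward a uniform random sample.
            q_rand = rng.uniform(lo, hi, size=q0.size)
            i = int(np.argmin([np.linalg.norm(q - q_rand) for q in nodes]))
            q_new = nodes[i] + clip_step(q_rand - nodes[i], step)
            v_new = vel[i]  # the child keeps the parent's optimizer state
        if np.any(q_new < lo) or np.any(q_new > hi):
            continue  # reject configurations outside the joint limits
        nodes.append(q_new)
        parent.append(i)
        vel.append(v_new)
        loss.append(visual_loss(q_new, goal_obs))
        expanded.append(False)
        if loss[-1] < tol:
            # Walk parent links back to the root to recover the path.
            j, path = len(nodes) - 1, []
            while j != -1:
                path.append(nodes[j])
                j = parent[j]
            return path[::-1]
    return None

# The planner only sees the goal observation, never the goal joint angles.
goal_obs = render(np.array([1.0, 0.8]))
path = plan([0.0, 0.0], goal_obs)
```

The sketch omits collision checking entirely; in an obstacle-free toy problem the gradient branch alone would likely suffice, and the random branch is included only to show where sampling-based exploration would take over when gradient steps are blocked or stall.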
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Visual-goal pose reconstruction | Franka Robot Environment | Success Rate: 100% | 12 |
| Visual-goal pose reconstruction | UR5e Robot Environment | Success Rate: 100% | 12 |
| Visual-goal pose reconstruction | Fetch Robot Environment | Success Rate: 100% | 12 |
| Visual-goal motion planning | Franka 0.5 rad bin 1.0 (test) | Success Rate: 93% | 4 |
| Visual-goal motion planning | Franka rad bin 1.0 (test) | Success Rate: 89% | 4 |
| Visual-goal motion planning | Franka 1.5 rad bin 1.0 (test) | Success Rate: 81.7% | 4 |
| Visual-goal motion planning | Franka 2.0 rad bin 1.0 (test) | Success Rate: 67.7% | 4 |
| Visual-goal motion planning | Franka 2.5 rad bin 1.0 (test) | Success Rate: 44.7% | 4 |
| Visual-goal motion planning | Franka Avg. 1.0 (test) | Success Rate: 75.2% | 4 |
| Visual-goal motion planning | UR5e 0.5 rad bin 1.0 (test) | Success Rate: 87.3% | 4 |