
PALM: Enhanced Generalizability for Local Visuomotor Policies via Perception Alignment

About

Generalizing beyond the training domain in image-based behavior cloning remains challenging. Existing methods address individual axes of generalization (workspace shifts, viewpoint changes, and cross-embodiment transfer), yet they are typically developed in isolation and often rely on complex pipelines. We introduce PALM (Perception Alignment for Local Manipulation), which leverages the invariance of local action distributions between out-of-distribution (OOD) and demonstrated domains to address these OOD shifts concurrently, without additional input modalities, model changes, or data collection. PALM modularizes the manipulation policy into coarse global components and a local policy for fine-grained actions. We reduce the discrepancy between in-domain and OOD inputs at the local policy level by enforcing local visual focus and a consistent proprioceptive representation, allowing the policy to retrieve invariant local actions under OOD conditions. Experiments show that PALM limits OOD performance drops to 8% in simulation and 24% in the real world, compared to 45% and 77% for baselines.

Ruiyu Wang, Zheyu Zhuang, Danica Kragic, Florian T. Pokorny • 2026

Related benchmarks

Task | Dataset | Metric | Result | Rank
Robotic Manipulation | Four simulated robotic tasks (In-domain) | Success Rate | 73 | 9
Robotic Manipulation | Four simulated robotic tasks (Out-of-domain Workspace) | Normalized Degradation | -0.02 | 9
Robotic Manipulation | Four simulated robotic tasks (Out-of-domain Viewpoint) | Normalized Degradation | 5 | 9
Robotic Manipulation | Four simulated robotic tasks (Out-of-domain Embodiment) | Normalized Degradation | 0.1 | 9
Robotic Manipulation | Four simulated robotic tasks (Out-of-domain Workspace + Viewpoint) | Normalized Degradation | 0.00 | 9
Robotic Manipulation | Four simulated robotic tasks (Out-of-domain Workspace + Embodiment) | Normalized Degradation | 0.05 | 9
Robotic Manipulation | Four simulated robotic tasks (Out-of-domain Viewpoint + Embodiment) | Normalized Degradation | 0.21 | 9
Robotic Manipulation | Four simulated robotic tasks (Out-of-domain All shifts) | Normalized Degradation | 0.18 | 9
Robotic Manipulation | Four simulated robotic tasks (OOD average) | Normalized Degradation | 0.08 | 9
Block-stacking | Real-world Robotic Experiment Stack (Workspace OOD) | Success Rate | 1 | 4
Showing 10 of 16 rows
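
The page does not define the "Normalized Degradation" metric. A plausible reading, offered here only as an assumption, is the relative drop in success rate from the in-domain setting to an OOD setting; this would match the abstract's "8%" simulated drop against the 0.08 OOD average, and would explain a negative value when the OOD setting scores slightly higher. The function name and the OOD success rate in the usage line are hypothetical and chosen purely for illustration; they are not results from the paper.

```python
def normalized_degradation(in_domain: float, ood: float) -> float:
    """Relative drop in success rate from in-domain to OOD.

    ASSUMPTION: this formula is a guess at the metric, not the
    authors' stated definition. Positive values mean the OOD
    setting performs worse; negative values mean it performs better.
    """
    if in_domain <= 0:
        raise ValueError("in-domain success rate must be positive")
    return (in_domain - ood) / in_domain


# Hypothetical OOD success rate (67.16), picked only to illustrate
# how an 8% relative drop from an in-domain rate of 73 would arise.
example = round(normalized_degradation(73.0, 67.16), 2)
print(example)
```

Under this reading, a value such as -0.02 would indicate that the OOD success rate exceeded the in-domain rate by 2% in relative terms.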
