
FunCanon: Learning Pose-Aware Action Primitives via Functional Object Canonicalization for Generalizable Robotic Manipulation

About

Learning general-purpose robotic skills from end-to-end demonstrations often leads to task-specific policies that fail to generalize beyond the training distribution. To address this, we introduce FunCanon, a framework that converts long-horizon manipulation tasks into sequences of action chunks, each defined by an actor, a verb, and an object. These chunks focus policy learning on the actions themselves rather than on isolated tasks, enabling compositionality and reuse. To make policies pose-aware and category-general, we perform functional object canonicalization for functional alignment and automatic manipulation trajectory transfer, mapping objects into shared functional frames using affordance cues from large vision-language models. FuncDiffuser, an object-centric and action-centric diffusion policy trained on this aligned data, naturally respects object affordances and poses, which simplifies learning and improves generalization. Experiments on simulated and real-world benchmarks demonstrate category-level generalization, cross-task behavior reuse, and robust sim2real deployment, showing that functional canonicalization provides a strong inductive bias for scalable imitation learning in complex manipulation domains. Details of the demo and supplemental material are available on our project website: https://sites.google.com/view/funcanon.

Hongli Xu, Lei Zhang, Xiaoyue Hu, Boyang Zhong, Kaixin Bai, Zoltán-Csaba Márton, Zhenshan Bing, Zhaopeng Chen, Alois Christian Knoll, Jianwei Zhang • 2025
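
The abstract describes two ideas that a small example may make more concrete: action chunks defined by an actor, a verb, and an object, and trajectory transfer through a shared functional frame. The following is a minimal sketch under stated assumptions, not the authors' implementation. The names `ActionChunk`, `make_pose`, `to_functional_frame`, and `transfer_trajectory` are hypothetical, the example chunk values are illustrative, and each object's functional frame is assumed to be already available as a 4x4 homogeneous transform (the step that derives it from affordance cues is not modeled here).

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class ActionChunk:
    """Hypothetical container: who acts, what is done, and on which object."""
    actor: str
    verb: str
    obj: str


def make_pose(yaw, translation):
    """Build a 4x4 homogeneous transform from a yaw angle (rad) and a translation."""
    c, s = np.cos(yaw), np.sin(yaw)
    pose = np.eye(4)
    pose[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    pose[:3, 3] = translation
    return pose


def to_functional_frame(world_T_ee, world_T_obj):
    """Express world-frame end-effector poses relative to an object's functional frame."""
    obj_T_world = np.linalg.inv(world_T_obj)
    return [obj_T_world @ T for T in world_T_ee]


def transfer_trajectory(world_T_ee, world_T_src, world_T_tgt):
    """Replay a demonstration recorded around a source object on a target object.

    Assumes both objects' functional frames are aligned, so the same
    object-relative motion is meaningful for both.
    """
    canonical = to_functional_frame(world_T_ee, world_T_src)
    return [world_T_tgt @ T for T in canonical]


# Illustrative long-horizon task as a sequence of chunks (values are made up).
task = [ActionChunk("gripper", "pick", "mug"), ActionChunk("mug", "pour", "bowl")]

# Source object at x = 0.5 m; target object elsewhere and rotated by 90 degrees.
source_frame = make_pose(0.0, [0.5, 0.0, 0.0])
target_frame = make_pose(np.pi / 2, [0.2, 0.3, 0.0])

# Two demonstrated end-effector poses near the source object.
demo = [make_pose(0.0, [0.5, 0.0, 0.2]), make_pose(0.0, [0.6, 0.0, 0.1])]
new_trajectory = transfer_trajectory(demo, source_frame, target_frame)
```

Because the demonstration is stored relative to the object rather than the world, the same motion follows the target object when its pose changes. This is the sense in which an object-centric representation can make a policy pose-aware.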

Related benchmarks

| Task | Dataset | Result | Rank |
| Pick-&-Place | Real-world (Unseen) | Success Rate: 88 | 9 |
| Pick-&-Place | RLBench Put A in B (Pose-level substitution) | Success Rate: 60 | 3 |
| Pick-&-Place | RLBench Put A in B Instance-level substitution | Success Rate: 61.3 | 3 |
| Pick-&-Place | RLBench Put A in B Category-level substitution | Success Rate: 60 | 3 |
| Pick, Pour (L1) | Real World unknown objects | Success Rate: 90 | 3 |
| Pick, Pour (L2) | Real World unknown objects | Success Rate: 88 | 3 |
| Pour (Level 1) | RLBench Pour A in B (Pose-level substitution) | Success Rate: 77.3 | 3 |
| Pour (Level 1) | RLBench Pour A in B Instance-level substitution | Success Rate: 80 | 3 |
| Pour (Level 1) | RLBench Pour A in B (Category-level substitution) | Success Rate: 72 | 3 |
| Pour (Level 2) | RLBench Water B with A (Pose-level substitution) | Success Rate: 80 | 3 |
Showing 10 of 12 rows
