GLOVER++: Unleashing the Potential of Affordance Learning from Human Behaviors for Robotic Manipulation
About
Learning manipulation skills from human demonstration videos offers a promising path toward generalizable and interpretable robotic intelligence, particularly through the lens of actionable affordances. However, transferring such knowledge remains challenging due to: 1) a lack of large-scale datasets with precise affordance annotations, and 2) insufficient exploration of affordances in diverse manipulation contexts. To address these gaps, we introduce HOVA-500K, a large-scale, affordance-annotated dataset comprising 500,000 images across 1,726 object categories and 675 actions. We also release a standardized benchmarking suite for multi-modal affordance reasoning. Built upon HOVA-500K, we present GLOVER++, a global-to-local affordance training framework that effectively transfers actionable affordance knowledge from human demonstrations to downstream open-vocabulary reasoning tasks. GLOVER++ achieves state-of-the-art results on the HOVA-500K benchmark and demonstrates strong generalization across diverse downstream robotic manipulation tasks. By explicitly modeling actionable affordances, GLOVER++ facilitates robust transfer across scenes, modalities, and tasks. We hope that HOVA-500K and the GLOVER++ framework will serve as valuable resources for bridging the gap between human demonstrations and robotic manipulation capabilities.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Contact Point Evaluation | HANDAL Mini | Point Hit Rate | 67.6 | 4 |
| Contact Point Evaluation | HANDAL Easy | Point Hit Rate | 39.2 | 4 |
| Contact Point Evaluation | HANDAL Hard | Point Hit Rate | 34.4 | 4 |
| Contact Point Evaluation | HOVA-500K | Point Hit Rate | 28.6 | 4 |
| Contact Point Evaluation | 3DOI | Point Hit Rate | 6.8 | 4 |
| Contact Point Evaluation | 3DOI Easy | Point Hit Rate | 4.9 | 4 |
| Contact Point Evaluation | ReasonAff | Point Hit Rate | 4.3 | 4 |
| Contact Point Evaluation | InstructPart | Point Hit Rate | 4.7 | 4 |
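
The table reports a Point Hit Rate for each dataset but does not define it. As a rough illustration only, the sketch below assumes the metric is the percentage of predicted contact points that land on a nonzero pixel of a ground-truth affordance mask. That definition, the function name `point_hit_rate`, and the `(row, col)` point format are all assumptions made for this example, not the benchmark's official evaluation protocol.

```python
from typing import Sequence, Tuple

Point = Tuple[int, int]          # hypothetical (row, col) pixel coordinate
Mask = Sequence[Sequence[int]]   # hypothetical binary mask: nonzero = affordance region


def point_hit_rate(points: Sequence[Point], masks: Sequence[Mask]) -> float:
    """Return the percentage of predicted points that fall inside their mask.

    Assumed definition: one predicted point per image, counted as a hit when
    it lies within the image bounds and on a nonzero mask pixel.
    """
    if len(points) != len(masks):
        raise ValueError("points and masks must have the same length")
    if not points:
        return 0.0

    hits = 0
    for (row, col), mask in zip(points, masks):
        # Out-of-bounds predictions count as misses rather than errors.
        in_bounds = 0 <= row < len(mask) and 0 <= col < len(mask[row])
        if in_bounds and mask[row][col]:
            hits += 1
    return 100.0 * hits / len(points)


if __name__ == "__main__":
    mask = [[0, 1],
            [0, 0]]
    # First point lands on the nonzero pixel, second does not.
    print(point_hit_rate([(0, 1), (1, 1)], [mask, mask]))
```

Under this assumed definition, a score is simply a hit percentage over the evaluation set; whether the values in the table use the same scale or hit criterion is not stated in the source.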