Enabling Extensible Embodied Capabilities with Tools

About

Most existing embodied intelligence methods formulate perception, reasoning, planning, and control within a unified parameterized policy. Yet these capabilities are inherently hierarchical and heterogeneous, making them difficult to reliably learn and modularize within a single model. We propose a capability externalization approach that decouples heterogeneous capabilities into independently optimized tools, dynamically invoked at inference time. To this end, we introduce the Embodied Tool Protocol (ETP), a standardized protocol for embodied tool registration, discovery, invocation, and execution, and curate 100+ validated tools spanning perception, cognition, reasoning, and execution as the tool base. Building on this, we construct EmbodiedToolBench to evaluate both whether tool augmentation improves embodied performance and how well current models use tools across tool-necessity recognition, tool selection, tool execution, and tool-chain composition. Experiments across simulation and real-world platforms confirm that capability externalization consistently improves embodied performance (average gains of 31% on EB-ALFRED and 36% on EB-Navigation), yet reveal a clear boundary: gains are substantial for cognition and perception but limited for execution-type capabilities. Moreover, our analysis reveals that knowing when, which, and how to invoke tools remains a persistent challenge across all models, highlighting embodied tool competence as a critical direction for future research.
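To make the registration, discovery, and invocation steps named in the abstract more concrete, here is a minimal Python sketch of a tool registry. It is not the ETP specification or the authors' implementation: the names `ToolSpec`, `ToolRegistry`, `register`, `discover`, `invoke`, and the `count_objects` tool are all hypothetical, chosen only to illustrate the general idea of capabilities living outside the policy model and being called by name at inference time.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class ToolSpec:
    """Hypothetical tool record; field names are illustrative, not taken from ETP."""
    name: str
    category: str  # e.g. "perception", "cognition", "reasoning", "execution"
    description: str
    fn: Callable[..., Any]


class ToolRegistry:
    """Toy registry covering registration, discovery, and invocation."""

    def __init__(self) -> None:
        self._tools: Dict[str, ToolSpec] = {}

    def register(self, spec: ToolSpec) -> None:
        # Reject duplicate names so every tool can be addressed unambiguously.
        if spec.name in self._tools:
            raise ValueError(f"tool already registered: {spec.name}")
        self._tools[spec.name] = spec

    def discover(self, category: str) -> List[str]:
        # Return the sorted names of all tools in the requested category.
        return sorted(s.name for s in self._tools.values() if s.category == category)

    def invoke(self, name: str, **kwargs: Any) -> Any:
        # Look the tool up by name and call it with keyword arguments.
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name].fn(**kwargs)


def count_objects(labels: List[str], target: str) -> int:
    """Stand-in perception tool: count detections that match a label."""
    return sum(1 for label in labels if label == target)


registry = ToolRegistry()
registry.register(
    ToolSpec(
        name="count_objects",
        category="perception",
        description="Count detected objects with a given label.",
        fn=count_objects,
    )
)

found = registry.discover("perception")
result = registry.invoke("count_objects", labels=["mug", "plate", "mug"], target="mug")
```

In a setup like this, the model's job reduces to deciding whether a tool is needed, picking one from `discover`, and supplying arguments to `invoke`, which loosely mirrors the tool-necessity, tool-selection, and tool-execution axes the benchmark evaluates.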

Xueyang Zhou, Zijia Wang, Qianjiang Li, Yibo Hu, Guiyao Tie, Li Wan, Yidan Liu, Pan Zhou, Lichao Sun, Yongchao Chen • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Embodied Task Completion | EB-Habitat | Avg Success Rate: 77 | 63 |
| Embodied Task Completion | ALFRED EB | Avg Score: 92 | 36 |
| Balance Scale Manipulation | Real-world Robot Tasks 1.0 (test) | Success Rate: 90 | 16 |
| Building Blocks | Real-world Robot Tasks 1.0 (test) | Success Rate: 80 | 16 |
| Desktop Cleaning | Real-world Robot Tasks 1.0 (test) | Success Rate: 8 | 16 |
| Embodied Manipulation | EB-Manipulation | Average Score: 42 | 16 |
| Embodied Navigation | EB-Navigation | Average Score: 85 | 16 |
