Don't Yell at Your Robot: Physical Correction as the Collaborative Interface for Language Model Powered Robots
About
We present a novel approach for enhancing human-robot collaboration that uses physical interaction for real-time error correction of large language model (LLM) powered robots. Unlike methods that rely on verbal or text commands, the robot leverages an LLM to proactively execute 6 DoF linear Dynamical System (DS) commands based on a natural-language description of the scene. During motion, a human can provide physical corrections, which are used to re-estimate the desired intention, itself parameterized by a linear DS. The corrected DS can then be converted to natural language and included in the prompt to improve future LLM interactions. We provide proof-of-concept results in a hybrid real+sim experiment, showcasing physical interaction as a new possibility for LLM-powered human-robot interfaces.
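To make the correction loop concrete, here is a minimal sketch of how a linear DS target might be re-estimated from a physical correction and turned back into text. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not give the DS form or the estimator, so the sketch assumes `x_dot = gain @ (target - x)` with a known gain matrix, covers only the 3D position part of the 6 DoF command, and uses a simple least-squares average. All function names are hypothetical.

```python
import numpy as np

def ds_velocity(x, target, gain):
    """Assumed linear DS: velocity that pulls x toward the target."""
    return gain @ (target - x)

def rollout(x0, target, gain, dt=0.01, steps=2000):
    """Integrate the DS with forward Euler and return the final position."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x + dt * ds_velocity(x, target, gain)
    return x

def reestimate_target(positions, velocities, gain):
    """Re-estimate the DS target from positions/velocities observed while
    the human pushes the robot (assumption: the gain is known).

    Each sample satisfies v_i = gain @ (target - x_i), so it implies
    target = x_i + gain^{-1} v_i; averaging the implied targets is the
    least-squares estimate.
    """
    implied = positions + np.linalg.solve(gain, velocities.T).T
    return implied.mean(axis=0)

def describe_target(target):
    """Render the corrected target as text that could be added to a prompt."""
    return "Corrected goal position: x={:.2f}, y={:.2f}, z={:.2f}".format(*target)

if __name__ == "__main__":
    gain = 2.0 * np.eye(3)
    intended = np.array([0.5, 0.2, 0.3])  # where the human actually wants the robot
    rng = np.random.default_rng(0)
    xs = rng.uniform(-1.0, 1.0, size=(20, 3))         # positions during the correction
    vs = np.array([gain @ (intended - x) for x in xs])  # matching velocities
    corrected = reestimate_target(xs, vs, gain)
    print(describe_target(corrected))
    print(rollout([0.0, 0.0, 0.0], corrected, gain))
```

In this sketch the corrected target replaces the LLM's original one, and the sentence produced by `describe_target` stands in for the natural-language feedback the abstract describes. A full 6 DoF version would also need an orientation term, which is omitted here.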
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Robot Task Execution | Robot Task Scenarios Scenario S3 | Success Rate: 61 | 13 |
| Object Selection Among Similar Items | Scenario S1 | Success Rate: 100 | 9 |
| Single-Step Action Execution | Scenario S2 | Success Rate: 100 | 9 |
| Causal Action Execution | Scenario S4 | Success Rate: 52 | 9 |