
RoboFAC: A Comprehensive Framework for Robotic Failure Analysis and Correction

About

Vision-Language-Action (VLA) models have recently advanced robotic manipulation by translating natural-language instructions and visual observations into control actions. However, existing VLAs are primarily trained on successful expert demonstrations and lack structured supervision for failure diagnosis and recovery, limiting robustness in open-world scenarios. To address this limitation, we propose the Robotic Failure Analysis and Correction (RoboFAC) framework. We construct a large-scale failure-centric dataset comprising 9,440 erroneous manipulation trajectories and 78,623 QA pairs across 53 scenes in both simulation and real-world environments, with systematically categorized failure types. Leveraging this dataset, we develop a lightweight multimodal model specialized for task understanding, failure analysis, and failure correction, enabling efficient local deployment while remaining competitive with large proprietary models. Experimental results demonstrate that RoboFAC achieves 34.1% higher failure analysis accuracy than GPT-4o. Furthermore, we integrate RoboFAC as an external supervisor in a real-world VLA control pipeline, yielding a 29.1% relative improvement across four tasks while significantly reducing latency relative to GPT-4o. These results demonstrate that RoboFAC enables systematic failure diagnosis and recovery, significantly enhancing VLA recovery capabilities. Our model and dataset are publicly available at https://github.com/MINT-SJTU/RoboFAC.
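
The abstract reports a relative improvement rather than an absolute one, and the two are easy to confuse. The sketch below shows the arithmetic. The function name and the success rates are hypothetical, chosen only for illustration; they are not results from the paper.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return the gain of `improved` over `baseline` as a fraction of `baseline`."""
    if baseline == 0:
        raise ValueError("relative improvement is undefined for a zero baseline")
    return (improved - baseline) / baseline


# Hypothetical success rates (NOT taken from the paper).
baseline_rate = 0.40   # VLA alone
improved_rate = 0.50   # VLA with an external supervisor

absolute_gain = improved_rate - baseline_rate                        # 10 percentage points
relative_gain = relative_improvement(baseline_rate, improved_rate)   # 25% relative

print(f"absolute gain: {absolute_gain * 100:.1f} points")
print(f"relative gain: {relative_gain * 100:.1f}%")
```

With these made-up numbers, a gain of 10 percentage points corresponds to a 25% relative improvement, so a relative figure reads as larger than the absolute gain whenever the baseline is below 100%.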

Zewei Ye, Weifeng Lu, Minghao Ye, Tao Lin, Shuo Yang, Junchi Yan, Bo Zhao • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Robotic Manipulation | ManiSkill3 | Average Success Rate | 82.7 | 21 |
| Robotic Manipulation | Real-world Manipulation SO-100 | Place Success Rate | 60 | 10 |
| Robot Failure Analysis (MCQ) | RoboFAC Simulation | FD Score | 91 | 7 |
| Robot Failure Analysis (MCQ) | RoboFAC (Real-world) | FD | 80 | 7 |
| Robotic Failure Analysis | RoboFAC 1.0 (mixed simulated and real-world) | Task Success Rate (Short Horizon) | 82.74 | 6 |
| Free-language reasoning | RoboFAC Simulation | ROUGE-L (TI) | 32.3 | 4 |
| Free-language reasoning | RoboFAC (Real-world) | ROUGE-L (TI) | 33.7 | 4 |
| Robot Failure Explanation | RoboFail | Coherence Score (CS) | 0.452 | 3 |