Multi-Task Interactive Robot Fleet Learning with Visual World Models

About

Recent advancements in large-scale multi-task robot learning offer the potential for deploying robot fleets in household and industrial settings, enabling them to perform diverse tasks across various environments. However, AI-enabled robots often face challenges with generalization and robustness when exposed to real-world variability and uncertainty. We introduce Sirius-Fleet, a multi-task interactive robot fleet learning framework to address these challenges. Sirius-Fleet monitors robot performance during deployment and involves humans to correct the robot's actions when necessary. We employ a visual world model to predict the outcomes of future actions and build anomaly predictors to predict whether they will likely result in anomalies. As the robot autonomy improves, the anomaly predictors automatically adapt their prediction criteria, leading to fewer requests for human intervention and gradually reducing human workload over time. Evaluations on large-scale benchmarks demonstrate Sirius-Fleet's effectiveness in improving multi-task policy performance and monitoring accuracy. We demonstrate Sirius-Fleet's performance in both RoboCasa in simulation and Mutex in the real world, two diverse, large-scale multi-task benchmarks. More information is available on the project website: https://ut-austin-rpl.github.io/sirius-fleet

Huihan Liu, Yu Zhang, Vaarij Betala, Evan Zhang, James Liu, Crystal Ding, Yuke Zhu • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Failure Detection | LIBERO-10 Seen Tasks | bACC | 59.1 | 28 |
| Failure Detection | LIBERO 10 Unseen Tasks | bACC | 48.6 | 28 |
| Failure Detection | VLABench (Seen Tasks) | Balanced Accuracy (bACC) | 64.2 | 12 |
| Failure Detection | VLABench (Unseen Tasks) | bACC | 61.6 | 12 |
| Failure Detection | Bimanual Cable Manipulation (32 folds) | Nominal Accuracy | 66.9 | 8 |
| Confidence Estimation | LIBERO Pre-Execution spatial, object, goal | ECE | 30.31 | 8 |
| Confidence Estimation | LIBERO Online Execution spatial object goal | ECE | 0.3689 | 8 |
| Failure Detection | CUBE (Seen) | bACC | 52.5 | 8 |
| Failure Detection | CUBE (Unseen) | bACC | 47.2 | 8 |
| Failure Detection | KITCHEN (Seen) | bACC | 54.1 | 8 |
Showing 10 of 14 rows
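
For readers unfamiliar with the two metric names in the table, the sketch below shows how balanced accuracy (bACC) and expected calibration error (ECE) are commonly computed. It is an illustration of the standard definitions, not the evaluation code behind these numbers. The function names, the choice of label 1 for "failure", and the use of 10 equal-width confidence bins are assumptions made here; the benchmarks may use different conventions or report values on a different scale.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of true-positive rate and true-negative rate for binary labels.

    Assumption: label 1 marks a failure episode and label 0 a nominal one.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    tpr = tp / (tp + fn)
    tnr = tn / (tn + fp)
    return 0.5 * (tpr + tnr)


def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin.

    Assumption: equal-width bins over [0, 1]; n_bins=10 is a common default.
    """
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(ok for _, ok in members) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece


if __name__ == "__main__":
    print(balanced_accuracy([1, 1, 0, 0], [1, 0, 0, 0]))
    print(expected_calibration_error([0.9, 0.9, 0.1, 0.1], [1, 0, 0, 0]))
```

Balanced accuracy averages the per-class recall, so a detector that always predicts "no failure" scores 0.5 regardless of class imbalance. Both functions return a fraction in [0, 1]; multiplying by 100 gives a percentage.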
