PRISM: Performer RS-IMLE for Single-pass Multisensory Imitation Learning

About

Robotic imitation learning typically requires models that capture multimodal action distributions while operating at real-time control rates and accommodating multiple sensing modalities. Although recent generative approaches such as diffusion models, flow matching, and Implicit Maximum Likelihood Estimation (IMLE) have achieved promising results, they often satisfy only a subset of these requirements. To address this, we introduce PRISM, a single-pass policy based on a batch-global rejection-sampling variant of IMLE. PRISM couples a temporal multisensory encoder (integrating RGB, depth, tactile, audio, and proprioception) with a linear-attention generator using a Performer architecture. We demonstrate the efficacy of PRISM on a diverse real-world hardware suite, including loco-manipulation using a Unitree Go2 with a 7-DoF arm D1 and tabletop manipulation with a UR5 manipulator. Across challenging physical tasks such as pre-manipulation parking, high-precision insertion, and multi-object pick-and-place, PRISM outperforms state-of-the-art diffusion policies by 10-25% in success rate while maintaining high-frequency (30-50 Hz) closed-loop control. We further validate our approach on large-scale simulation benchmarks, including CALVIN, MetaWorld, and Robomimic. In CALVIN (10% data split), PRISM improves success rates by approximately 25% over diffusion and approximately 20% over flow matching, while simultaneously reducing trajectory jerk by 20x-50x. These results position PRISM as a fast, accurate, and multisensory imitation policy that retains multimodal action coverage without the latency of iterative sampling.
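The abstract does not spell out the batch-global rejection-sampling variant of IMLE, so the following is only a rough sketch of the generic rejection-sampling IMLE selection step, not PRISM's actual objective. It assumes that, for each demonstrated action, generated candidates closer than a threshold are rejected and the nearest remaining candidate supplies the squared-distance loss. The function name `rs_imle_select` and the parameter `eps` are hypothetical and chosen for illustration.

```python
import numpy as np

def rs_imle_select(samples, target, eps):
    """Pick the nearest candidate to `target` among candidates whose
    Euclidean distance to `target` is at least `eps`.

    Returns (index, squared_distance), or (None, None) if every
    candidate is rejected. Hypothetical helper, not PRISM's code.
    """
    # Distance from every generated candidate to the demonstrated action.
    d = np.linalg.norm(samples - target, axis=1)
    # Rejection step: drop candidates that are already within eps.
    keep = d >= eps
    if not keep.any():
        return None, None
    # Nearest-neighbour step over the surviving candidates only.
    masked = np.where(keep, d, np.inf)
    idx = int(np.argmin(masked))
    return idx, float(masked[idx] ** 2)

# Three candidate actions produced by a generator (made-up values).
samples = np.array([[0.05, 0.0], [1.0, 0.0], [0.0, 2.0]])
target = np.zeros(2)
idx, loss = rs_imle_select(samples, target, eps=0.1)
```

In a training loop the selected candidate's squared distance would be backpropagated through the generator. Because the generator produces an action in one forward pass at inference time, no iterative sampling is needed, which is the property the abstract contrasts with diffusion and flow matching.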

Amisha Bhaskar, Pratap Tokekar, Stefano Di Cairano, Alexander Schperberg • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Robot Manipulation | MetaWorld Medium 11 tasks | Success Rate: 85.5 | 18 |
| Robot Manipulation | MetaWorld Hard (6 tasks) | Success Rate: 58 | 18 |
| Robot Manipulation | MetaWorld Very Hard 5 tasks | Success Rate: 85.8 | 15 |
| Robotic Arm Manipulation | MetaWorld Easy | Success Rate: 96.4 | 15 |
| Robotic Arm Manipulation | MetaWorld Very Hard | Success Rate: 85.8 | 15 |
| Robot Manipulation | MetaWorld Easy 28 tasks | Success Rate: 96.4 | 9 |
| Robot Manipulation | Robomimic Proficient Human (PH) | Lift Success Rate: 100 | 6 |
| Robotic Manipulation | CALVIN 10% of Env D | No-RGB Success Rate: 65.2 | 4 |
| Robotic Manipulation | Meta-World Easy 50 tasks | Success Rate: 96.4 | 4 |
| Robotic Manipulation | Meta-World Med. 50 tasks | Success Rate: 85.5 | 4 |
Showing 10 of 12 rows
