
3D FlowMatch Actor: Unified 3D Policy for Single- and Dual-Arm Manipulation

About

We present 3D FlowMatch Actor (3DFA), a 3D policy architecture for robot manipulation that combines flow matching for trajectory prediction with 3D pretrained visual scene representations for learning from demonstration. 3DFA leverages 3D relative attention between action and visual tokens during action denoising, building on prior work in 3D diffusion-based single-arm policy learning. Through a combination of flow matching and targeted system-level and architectural optimizations, 3DFA achieves over 30x faster training and inference than previous 3D diffusion-based policies, without sacrificing performance. On the bimanual PerAct2 benchmark, it establishes a new state of the art, outperforming the next-best method by an absolute margin of 41.4%. In extensive real-world evaluations, it surpasses strong baselines with up to 1000x more parameters and significantly more pretraining. In unimanual settings, it sets a new state of the art on 74 RLBench tasks by directly predicting dense end-effector trajectories, eliminating the need for motion planning. Comprehensive ablation studies underscore the importance of our design choices for both policy effectiveness and efficiency.
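As a rough illustration of the flow matching idea the abstract refers to, the sketch below uses the common straight-line formulation: a sample moves from noise to data along a linear path, a network is trained to regress the velocity along that path, and sampling integrates the learned velocity with a few Euler steps. This is a generic toy example, not the 3DFA implementation. The function names, the 1-D `demo` waypoints, and the `oracle_velocity` stand-in for a trained network are all assumptions made for illustration.

```python
import random

def interpolate(x0, x1, t):
    """Point on the straight-line path from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def target_velocity(x0, x1):
    """Regression target for a velocity network along that path: x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]

def euler_sample(velocity_fn, x0, num_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 towards t=1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so 1 - t is never zero
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

# Made-up 1-D waypoints standing in for a demonstrated trajectory.
demo = [0.10, 0.25, 0.40, 0.55]

def oracle_velocity(x, t):
    """Hypothetical stand-in for a trained network: the exact velocity field
    when the data distribution is the single trajectory `demo`."""
    return [(d - xi) / (1.0 - t) for d, xi in zip(demo, x)]

rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in demo]
sampled = euler_sample(oracle_velocity, noise, num_steps=5)
```

With the oracle field, five Euler steps carry the noise sample onto `demo` up to floating-point error. In a real policy the velocity function would be a learned network conditioned on the scene, and the number of integration steps would be a tunable trade-off between speed and accuracy.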

Nikolaos Gkanatsios, Jiahe Xu, Matthew Bronars, Arsalan Mousavian, Tsung-Wei Ke, Katerina Fragkiadaki • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Bimanual Manipulation | RLBench 2 | Push Box Success Rate: 84 | 20 |
| Average performance across tasks | Real-world Bimanual Manipulation Tasks (test) | Success Rate: 35 | 8 |
| Handover | Real-world Bimanual Manipulation Tasks (test) | Success Rate: 30 | 8 |
| Pick up Plate | Real-world Bimanual Manipulation Tasks (test) | Success Rate: 40 | 8 |
| Articulated Manipulation | ShapeNet PartNet-Mobility seen objects | Bottle Success Rate: 4 | 6 |
| Plate-Lifting | PerAct seen objects 2 | Plate Success Rate: 71 | 6 |
| Plate-Lifting | PerAct2 unseen objects | Plate Success Rate: 68 | 6 |
| Articulated Manipulation | ShapeNet PartNet-Mobility unseen objects | Bottle Success Rate: 2 | 6 |
| Edge-Pushing | ShapeNet PartNet-Mobility seen objects | Success Rate (Bowl): 3 | 6 |
| Edge-Pushing | ShapeNet PartNet-Mobility unseen objects | Success Rate (Bowl): 0 | 6 |
Showing 10 of 16 rows
