
TwinBrainVLA: Unleashing the Potential of Generalist VLMs for Embodied Tasks via Asymmetric Mixture-of-Transformers

About

The fundamental premise of Vision-Language-Action (VLA) models is to harness the extensive general capabilities of pre-trained Vision-Language Models (VLMs) for generalized embodied intelligence. However, standard robotic fine-tuning inevitably disrupts the pre-trained feature space, leading to "catastrophic forgetting" that compromises the general visual understanding we aim to leverage. To effectively utilize the uncorrupted general capabilities of VLMs for robotic tasks, we propose TwinBrainVLA, which coordinates two isomorphic VLM pathways: a frozen generalist (also called "Left Brain") and a trainable specialist (also called "Right Brain"). Our architecture utilizes an Asymmetric Mixture-of-Transformers (AsyMoT) mechanism, enabling the Right Brain to dynamically query and fuse intact semantic knowledge from the Left Brain with proprioceptive states. This fused representation conditions a flow-matching action expert for precise continuous control. Empirical results on SimplerEnv and RoboCasa benchmarks demonstrate that by explicitly retaining general capabilities, TwinBrainVLA achieves substantial performance gains over baseline models in complex manipulation tasks.
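To make the asymmetry concrete, here is a minimal sketch of one possible reading of the AsyMoT idea. The abstract only states that the trainable Right Brain queries the frozen Left Brain; the specific choice below (Right Brain queries attending over the concatenation of Left Brain and Right Brain tokens, with identity projections instead of learned ones) is an assumption for illustration, and the names `attend` and `asymot_layer` are hypothetical, not taken from the paper or its code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend(queries, memory):
    """Scaled dot-product attention with identity projections.

    Each query token returns a softmax-weighted average of the memory
    tokens (memory serves as both keys and values in this sketch).
    """
    dim = len(memory[0])
    scale = 1.0 / math.sqrt(dim)
    outputs = []
    for q in queries:
        weights = softmax([dot(q, k) * scale for k in memory])
        outputs.append(
            [sum(w * v[i] for w, v in zip(weights, memory)) for i in range(dim)]
        )
    return outputs


def asymot_layer(left_tokens, right_tokens):
    """One asymmetric layer (assumed form, not the authors' exact design).

    Left (frozen generalist): attends only to its own tokens, so nothing
    from the right pathway can change its features.
    Right (trainable specialist): attends to left tokens plus its own
    tokens, which could include a proprioceptive state token.
    """
    left_out = attend(left_tokens, left_tokens)
    right_out = attend(right_tokens, left_tokens + right_tokens)
    return left_out, right_out
```

In this sketch information flows in one direction only: changing `right_tokens` never alters `left_out`, which mirrors the stated goal of keeping the generalist pathway's features intact while the specialist reads from them.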

Bin Yu, Shijie Lian, Xiaopeng Lin, Yuliang Wei, Zhaolong Shen, Changti Wu, Yuzhuo Miao, Xinming Wang, Bailing Wang, Cong Huang, Kai Chen • 2026

Related benchmarks

Task | Dataset | Metric | Result | Rank
Robot Manipulation | SimplerEnv OOD | Put Spoon on Towel Success Rate | 87.5 | 19
Robot Manipulation | LIBERO (All four suites (combined)) | Spatial Success Rate | 99.2 | 12
Robot Manipulation | RoboCasa Tabletop official | Avg Success Rate | 0.546 | 8
Pick-&-Place | Franka Research 3 Out-of-Domain zero-shot | Success Rate | 5.00e+3 | 5
Pick-&-Place | Franka Research 3 Pick-All (long-horizon) | Success Rate | 1.00e+3 | 5
Pick-&-Place | Franka Research 3 In-Domain | Success Rate | 93.3 | 5

