DAM-VLA: A Dynamic Action Model-Based Vision-Language-Action Framework for Robot Manipulation

About

In dynamic environments such as warehouses, hospitals, and homes, robots must seamlessly transition between gross motion and precise manipulations to complete complex tasks. However, current Vision-Language-Action (VLA) frameworks, largely adapted from pre-trained Vision-Language Models (VLMs), often struggle to reconcile general task adaptability with the specialized precision required for intricate manipulation. To address this challenge, we propose DAM-VLA, a dynamic action model-based VLA framework. DAM-VLA integrates VLM reasoning with diffusion-based action models specialized for arm and gripper control. Specifically, it introduces (i) an action routing mechanism, using task-specific visual and linguistic cues to select appropriate action models (e.g., arm movement or gripper manipulation), (ii) a dynamic action model that fuses high-level VLM cognition with low-level visual features to predict actions, and (iii) a dual-scale action weighting mechanism that enables dynamic coordination between the arm-movement and gripper-manipulation models. Across extensive evaluations, DAM-VLA achieves superior success rates compared to state-of-the-art VLA methods in simulated (SIMPLER, FurnitureBench) and real-world settings, showing robust generalization from standard pick-and-place to demanding long-horizon and contact-rich tasks.
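The abstract describes routing between an arm-movement model and a gripper-manipulation model and weighting their outputs, but it gives no implementation details. The sketch below is only a generic soft-gating illustration of that idea, not the authors' method: the function names (`route`, `blend_actions`), the softmax gating, the scalar cue scores, and the 4-dimensional action layout are all assumptions introduced here for illustration.

```python
import math
from typing import Dict, List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(cue_scores: Dict[str, float]) -> Dict[str, float]:
    """Turn per-model routing scores into weights that sum to 1.

    Hypothetical: in a real system the scores would be derived from
    visual and linguistic features; here they are plain floats.
    """
    names = list(cue_scores)
    weights = softmax([cue_scores[name] for name in names])
    return dict(zip(names, weights))


def blend_actions(
    weights: Dict[str, float],
    proposals: Dict[str, Sequence[float]],
) -> List[float]:
    """Weighted sum of the action proposals from each action model."""
    dims = {len(action) for action in proposals.values()}
    if len(dims) != 1:
        raise ValueError("all action proposals must have the same length")
    dim = dims.pop()
    return [
        sum(weights[name] * proposals[name][i] for name in proposals)
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Assumed action layout: [dx, dy, dz, gripper_command].
    weights = route({"arm": 2.0, "gripper": 0.0})
    action = blend_actions(
        weights,
        {
            "arm": [0.10, 0.00, -0.05, 0.0],
            "gripper": [0.00, 0.00, 0.00, 1.0],
        },
    )
    print(weights)
    print(action)
```

With a higher score for `"arm"`, the blended action is dominated by the arm proposal while the gripper proposal still contributes a small share. A hard router would instead pick the single highest-scoring model.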

Xiongfeng Peng, Jiaqian Yu, Dingzhe Li, Yixiang Jin, Lu Xu, Yamin Mao, Chao Zhang, Weiming Li, Sujin Jang, Dongwook Lee, Daehyun Ji • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Robotic Manipulation | SIMPLER Visual Matching WidowX robot | Put Spoon on Towel Score: 88 | 51 |
| Robotic Manipulation | SIMPLER Google Robot VA | Pick Up Coke Can Success Rate: 98 | 35 |
| Robot Manipulation | SIMPLER Google robot, Visual Matching setting (test) | Success Rate (PCC): 96 | 10 |
| Pick-&-Place | Real-world Robot Pick-and-place Average | Success Rate: 86.8 | 5 |
| One-Leg assembly | FurnitureBench | Step 1 Success Rate: 100 | 3 |
| Pick-&-Place | Real-world In-Distribution | Success Rate: 91.4 | 2 |
| Pick-&-Place | Real-world Out-of-Distribution | Success Rate: 82.2 | 2 |