
DA-PTQ: Drift-Aware Post-Training Quantization for Efficient Vision-Language-Action Models

About

Vision-Language-Action models (VLAs) have demonstrated strong potential for embodied AI, yet their deployment on resource-limited robots remains challenging due to high memory and computational demands. While Post-Training Quantization (PTQ) provides an efficient solution, directly applying PTQ to VLAs often results in severe performance degradation during sequential control. We identify temporal error accumulation as a key factor, where quantization perturbations at the vision-language-to-action interface are progressively amplified, leading to kinematic drift in executed trajectories. To address this issue, we propose Drift-Aware Post-Training Quantization (DA-PTQ), which formulates quantization as a drift-aware optimization problem over sequential decision processes. DA-PTQ consists of two components: (1) Cross-Space Representation Compensation, which mitigates structured distortions between multimodal representations and action space to improve action consistency, and (2) Motion-Driven Mixed-Precision Allocation, which assigns bit-widths by minimizing trajectory-level motion errors. Extensive experiments show that DA-PTQ significantly reduces kinematic drift and achieves comparable performance to full-precision models under low-bit settings, enabling practical deployment of VLAs on resource-limited robotic platforms.
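The temporal error accumulation described above can be illustrated with a toy simulation (this is an illustrative sketch, not the paper's method or models): small per-step perturbations on predicted actions, standing in for quantization error at the vision-language-to-action interface, compound into growing kinematic drift once actions are integrated into a trajectory.

```python
import numpy as np

# Toy illustration of temporal error accumulation: per-step quantization
# noise on actions is tiny, but integrating actions over a long horizon
# lets the errors compound into trajectory-level kinematic drift.
rng = np.random.default_rng(0)

def rollout(steps=200, noise_std=0.0):
    """Integrate noisy 2-D actions; noise_std models quantization error."""
    pos = np.zeros(2)
    trajectory = []
    for _ in range(steps):
        action = np.array([0.01, 0.0])                  # nominal full-precision action
        action = action + rng.normal(0, noise_std, 2)   # per-step quantization perturbation
        pos = pos + action                              # kinematic integration
        trajectory.append(pos.copy())
    return np.array(trajectory)

clean = rollout(noise_std=0.0)       # full-precision trajectory
quant = rollout(noise_std=0.002)     # quantized trajectory (hypothetical noise level)

# End-point drift grows with horizon even though each step's error is tiny.
final_drift = np.linalg.norm(quant[-1] - clean[-1])
print(final_drift)
```

Under this random-walk model the drift grows roughly with the square root of the horizon, which is why errors that are negligible per step still derail long sequential control, the failure mode DA-PTQ targets.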

Siyuan Xu, Tianshi Wang, Fengling Li, Lei Zhu, Heng Tao Shen • 2026

Related benchmarks

Task | Dataset | Result | Rank
Robot Manipulation | SimplerEnv Google Robot tasks, Variant Aggregation | Average Success Rate: 51.7 | 67
Robot Manipulation | SimplerEnv Google Robot, Visual Matching | Pick Coke Can: 92.4 | 43
Robotic Manipulation | SimplerEnv WidowX, In-domain Visual Matching setting | Success Rate (Spoon on Towel): 65.2 | 4
Robotic Manipulation | SimplerEnv | Success Rate: 48.9 | 3
