DefVINS: Visual-Inertial Odometry for Deformable Scenes

About

Deformable scenes violate the rigidity assumptions underpinning classical visual-inertial odometry (VIO), often leading to over-fitting to local non-rigid motion or to severe camera pose drift when deformation dominates visual parallax. In this paper, we introduce DefVINS, the first visual-inertial odometry pipeline designed to operate in deformable environments. Our approach models the odometry state by decomposing it into a rigid, IMU-anchored component and a non-rigid scene warp represented by an embedded deformation graph. As a second contribution, we present VIMandala, the first benchmark containing real images and ground-truth camera poses for visual-inertial odometry in deformable scenes. In addition, we augment the synthetic Drunkard's benchmark with simulated inertial measurements to further evaluate our pipeline under controlled conditions. We also provide an observability analysis of the visual-inertial deformable odometry problem, characterizing how inertial measurements constrain camera motion and render otherwise unobservable modes identifiable in the presence of deformation. This analysis motivates the use of IMU anchoring and leads to a conditioning-based activation strategy that avoids ill-posed updates under poor excitation. Experimental results on both the synthetic Drunkard's and our real VIMandala benchmarks show that DefVINS outperforms rigid visual-inertial and non-rigid visual odometry baselines. Our source code and data will be released upon acceptance.
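
As an illustration of the "non-rigid scene warp represented by an embedded deformation graph" mentioned above, the sketch below warps a single 3D point with the standard embedded deformation formula, v' = sum_j w_j (R_j (v - g_j) + g_j + t_j). The abstract does not give the paper's own parameterisation, so the function names, the node layout `(g, R, t)`, and the sample values are all assumptions made for illustration only.

```python
def mat_vec(R, v):
    """Multiply a 3x3 matrix (nested tuples) by a 3-vector."""
    return tuple(sum(R[i][k] * v[k] for k in range(3)) for i in range(3))


def warp_point(v, nodes, weights):
    """Warp point v with an embedded deformation graph.

    nodes   : list of (g, R, t) with node position g, rotation R, translation t
    weights : blend weights of v with respect to each node (assumed to sum to 1)
    """
    out = [0.0, 0.0, 0.0]
    for (g, R, t), w in zip(nodes, weights):
        local = tuple(v[i] - g[i] for i in range(3))   # v expressed relative to the node
        rotated = mat_vec(R, local)                    # apply the node's local rotation
        for i in range(3):
            out[i] += w * (rotated[i] + g[i] + t[i])   # back to world frame, then translate
    return tuple(out)


# Hypothetical two-node graph (values chosen only for illustration).
IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
ROT_Z_90 = ((0.0, -1.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0))
nodes = [
    ((0.0, 0.0, 0.0), IDENTITY, (0.0, 0.0, 1.0)),  # node 0: pure translation along z
    ((2.0, 0.0, 0.0), ROT_Z_90, (0.0, 0.0, 0.0)),  # node 1: 90 degree rotation about z
]
warped = warp_point((1.0, 0.0, 0.0), nodes, [0.5, 0.5])
print(warped)
```

In a full pipeline the node rotations and translations would be optimisation variables estimated alongside the camera pose; here they are fixed by hand to keep the example small.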

Samuel Cerezo, Javier Civera • 2026

Related benchmarks

Task	Dataset	Metric	Value
Visual-Inertial Odometry	Real Deformable Sequences Medium Deformation R2	ATE RMSE (mm)	10.2	5
Visual-Inertial Odometry	Real Deformable Sequences Medium Deformation R3	ATE RMSE	10.8	5
Visual-Inertial Odometry	Real Deformable Sequences Medium Deformation R4	ATE RMSE (mm)	11.4	5
Visual-Inertial Odometry	Real Deformable Sequences R5 (High Deformation)	ATE RMSE (mm)	15.6	5
Visual-Inertial Odometry	Real Deformable Sequences High Deformation R6	ATE RMSE (mm)	19.8	5
Visual-Inertial SLAM	Drunkard's Dataset Medium deformation - L1	ATE RMSE (mm)	9.4	5
Visual-Inertial SLAM	Drunkard's Dataset Hard deformation - L2	ATE RMSE (mm)	14.3	5
Visual-Inertial SLAM	Drunkard's Dataset Extreme deformation - L3	ATE RMSE (mm)	19.6	5
Visual-Inertial Odometry	Real Deformable Sequences Low Deformation R0	ATE RMSE	8.1	5
Visual-Inertial Odometry	Real Deformable Sequences Low Deformation R1	ATE RMSE (mm)	9	5

Showing 10 of 11 rows
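
For readers unfamiliar with the metric reported in the table, ATE RMSE (absolute trajectory error, root mean square) can be sketched as follows. This is a minimal illustration, assuming the estimated and ground-truth trajectories are already time-associated and expressed in the same reference frame; evaluation tools normally perform a rigid alignment first, which is omitted here. The function name and the sample positions are hypothetical and are not taken from the paper.

```python
import math


def ate_rmse(estimated, ground_truth):
    """RMSE of per-pose translational error between two pre-aligned trajectories."""
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Squared Euclidean distance between each estimated position and its ground truth.
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(est, gt))
        for est, gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical camera positions in millimetres (illustration only).
gt = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
est = [(0.0, 3.0, 0.0), (10.0, 0.0, 4.0), (20.0, 0.0, 0.0)]
print(ate_rmse(est, gt))  # per-pose errors are 3 mm, 4 mm and 0 mm
```

Because the errors are squared before averaging, a few poses with large drift raise the score more than many poses with small error.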