Endo-Depth-and-Motion: Reconstruction and Tracking in Endoscopic Videos using Depth Networks and Photometric Constraints

About

Estimating a scene reconstruction and the camera motion from in-body videos is challenging due to several factors, e.g. the deformation of in-body cavities or the lack of texture. In this paper, we present Endo-Depth-and-Motion, a pipeline that estimates the 6-degrees-of-freedom camera pose and dense 3D scene models from monocular endoscopic videos. Our approach leverages recent advances in self-supervised depth networks to generate pseudo-RGBD frames, then tracks the camera pose using photometric residuals and fuses the registered depth maps in a volumetric representation. We present an extensive experimental evaluation on the public Hamlyn dataset, showing high-quality results and comparisons against relevant baselines. We also release all models and code for future comparisons.
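
To make the "photometric residuals" step concrete, here is a minimal NumPy sketch of how a per-pixel photometric residual between a pseudo-RGBD reference frame and a later frame can be computed for a candidate camera pose. It is an illustration under assumptions, not the authors' implementation: the function name `photometric_residuals`, the pinhole intrinsics `K`, the 4x4 pose `T_cur_ref`, and the nearest-neighbour sampling are all choices made here for brevity. A tracker would minimise some robust norm of these residuals over the pose.

```python
import numpy as np


def photometric_residuals(ref_gray, ref_depth, cur_gray, K, T_cur_ref):
    """Warp reference pixels into the current frame and compare intensities.

    ref_gray, cur_gray: (H, W) float intensity images.
    ref_depth:          (H, W) depth for the reference frame (e.g. network output).
    K:                  (3, 3) pinhole intrinsics (assumed known).
    T_cur_ref:          (4, 4) rigid transform taking reference-camera points
                        to the current camera (the pose hypothesis).
    Returns (residuals, valid), both of shape (H, W).
    """
    h, w = ref_gray.shape
    v, u = np.mgrid[0:h, 0:w]
    # Homogeneous pixel coordinates, one column per pixel (3 x N).
    pix = np.stack([u.ravel(), v.ravel(), np.ones(h * w)], axis=0).astype(float)

    # Back-project each pixel to a 3D point using its depth.
    pts_ref = (np.linalg.inv(K) @ pix) * ref_depth.ravel()

    # Move the points into the current camera frame.
    pts_cur = T_cur_ref[:3, :3] @ pts_ref + T_cur_ref[:3, 3:4]

    # Project into the current image; points behind the camera are invalid.
    z = pts_cur[2]
    valid = z > 1e-6
    proj = K @ pts_cur
    u_cur = np.full(h * w, -1.0)
    v_cur = np.full(h * w, -1.0)
    u_cur[valid] = proj[0, valid] / z[valid]
    v_cur[valid] = proj[1, valid] / z[valid]

    # Nearest-neighbour lookup (a simplification; bilinear sampling is common).
    ui = np.rint(u_cur).astype(int)
    vi = np.rint(v_cur).astype(int)
    valid &= (ui >= 0) & (ui < w) & (vi >= 0) & (vi < h)

    residuals = np.zeros(h * w)
    residuals[valid] = cur_gray[vi[valid], ui[valid]] - ref_gray.ravel()[valid]
    return residuals.reshape(h, w), valid.reshape(h, w)
```

With an identity pose and identical images the residuals are zero everywhere; a wrong pose hypothesis produces non-zero residuals, which is the signal a photometric tracker uses to refine the pose.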

David Recasens, José Lamarca, José M. Fácil, J. M. M. Montiel, Javier Civera • 2021

Related benchmarks

Task	Dataset	Metric	Value	
Depth Estimation	SCARED (test)	Abs Rel	0.203	28
Trajectory Estimation	Drunkard's Dataset Level 0	Success Rate [%]	100	11
Trajectory Estimation	Drunkard's Dataset Level 1	Frame Accuracy	1	11
Trajectory Estimation	Drunkard's Dataset Level 2	Frame Success Rate	1	11
Trajectory Estimation	Drunkard's Dataset Level 3	Frame Accuracy (%)	1	11
Depth Estimation	Hamlyn 22 videos	Abs Rel	0.216	10
Novel View Synthesis	C3VD average across ten scenes	PSNR	18.13	10
Rendering	C3VD high-definition (test)	PSNR	18.13	8
Camera Tracking	C3VD high-definition (test)	ATE (mm)	1.25	8
Depth Reconstruction	C3VD high-definition (test)	RMSE (mm)	5.1	8

Showing 10 of 17 rows
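
For readers unfamiliar with the depth metrics in the table, the sketch below shows the standard definitions of Abs Rel (mean absolute relative error) and RMSE over pixels with valid ground truth. The function name `depth_errors` and the `gt > 0` validity mask are assumptions made here; the exact evaluation protocol of each benchmark (depth caps, scale alignment, masking) is not stated on this page and may differ.

```python
import numpy as np


def depth_errors(pred, gt):
    """Return (abs_rel, rmse) over pixels where ground-truth depth is positive."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    mask = gt > 0  # assumption: non-positive ground truth marks missing depth
    p, g = pred[mask], gt[mask]
    abs_rel = np.mean(np.abs(p - g) / g)       # mean of |pred - gt| / gt
    rmse = np.sqrt(np.mean((p - g) ** 2))      # same units as the depth maps
    return abs_rel, rmse
```

Abs Rel is unitless, while RMSE carries the unit of the depth maps, which is why the table reports it in millimetres.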
