ActMVS: Active Scene Reconstruction with Monocular Multi-View Stereo

About

Active scene reconstruction enables robots and UAVs to autonomously plan trajectories and reconstruct environments without costly manual data acquisition. Unlike passive methods, active reconstruction requires real-time construction of high-confidence occupancy maps for collision-free navigation. Existing approaches rely on depth sensors for occupancy map updates, which increases platform cost and weight. To advance spatial intelligence, we aim for a vision-only monocular solution. However, current monocular scene reconstruction methods operate offline and fail to deliver globally consistent dense depth at the frame rates required for robot and UAV navigation. To bridge this gap, we introduce ActMVS, the first framework for monocular active reconstruction. Our framework integrates view factor graph construction for informed Multi-View Stereo depth prediction with a global depth optimization, enabling the online generation of high-quality, globally consistent dense depth maps. This allows monocular robots and UAVs to maintain reliable occupancy maps for safe trajectory planning during reconstruction. Experiments on the Replica dataset demonstrate performance competitive with RGB-D methods. Our code and data are available at https://github.com/TrickyGo/ActMVS.
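
To illustrate what "occupancy map updates" from dense depth can look like, here is a minimal sketch in Python. It is not the ActMVS implementation. It assumes a pinhole camera, a known camera-to-world pose, and a sparse voxel grid holding log-odds values. The names `backproject`, `to_world`, `update_occupancy`, and `occupancy_probability`, and the constants `L_OCC` and `VOXEL`, are hypothetical and chosen only for this example. For brevity it marks only the voxel hit by each depth sample and skips the free-space ray casting a real mapper would also perform.

```python
import math

# Hypothetical constants for illustration only (not taken from ActMVS).
L_OCC = 0.85   # log-odds increment for a voxel observed as occupied
VOXEL = 0.25   # voxel edge length in metres


def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with metric depth to a 3D point in the camera frame."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def to_world(p, R, t):
    """Apply a camera-to-world pose (R: 3x3 nested lists, t: length-3 list)."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))


def update_occupancy(grid, depth_map, intrinsics, R, t):
    """Add an 'occupied' observation for each valid depth pixel.

    grid maps integer voxel indices (i, j, k) to accumulated log-odds.
    depth_map is a list of rows; invalid pixels are None or non-positive.
    """
    fx, fy, cx, cy = intrinsics
    for v, row in enumerate(depth_map):
        for u, d in enumerate(row):
            if d is None or d <= 0:
                continue  # skip pixels without a usable depth estimate
            pw = to_world(backproject(u, v, d, fx, fy, cx, cy), R, t)
            key = tuple(math.floor(c / VOXEL) for c in pw)
            grid[key] = grid.get(key, 0.0) + L_OCC
    return grid


def occupancy_probability(log_odds):
    """Convert accumulated log-odds back to a probability in [0, 1]."""
    return 1.0 - 1.0 / (1.0 + math.exp(log_odds))
```

Repeated observations of the same voxel accumulate log-odds, so a planner can threshold `occupancy_probability` to decide which voxels to treat as obstacles. How ActMVS weights or filters its predicted depth before such an update is not described on this page.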

Guo Pu, Yixuan Han, Zhouhui Lian • 2026

Related benchmarks

Task                 Dataset           Result                        Rank
Mesh Reconstruction  Replica Room 0    --                            21
Mesh Reconstruction  Replica Room 1    Completion (cm) 2.081         16
Mesh Reconstruction  Replica Office 0  Completion (cm) 2             16
Mesh Reconstruction  Replica Room 2    Completion Ratio [< δ] 52.31  16
Rendering            Replica Room 0    PSNR 27.17                    11
Rendering            Replica Room 1    PSNR 28.05                    11
Rendering            Replica Office 1  PSNR 31.96                    11
Rendering            Replica Office 4  PSNR 29.99                    11
Rendering            Replica Room 2    PSNR 28.69                    11
Rendering            Replica Office 0  PSNR 31.63                    11
Showing 10 of 16 rows
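
For readers unfamiliar with the rendering metric in the table, the sketch below computes PSNR using its standard definition, 10 · log10(MAX² / MSE). This page does not state how the listed PSNR values were computed, so the function name `psnr`, the flat list-of-pixel-values input, and the default 8-bit `max_value` of 255 are all assumptions made for illustration.

```python
import math


def psnr(rendered, reference, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    max_value is the largest possible pixel value (255 for 8-bit images,
    1.0 for images normalised to [0, 1]); it is an assumption of this sketch.
    """
    if len(rendered) != len(reference) or len(rendered) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(rendered, reference)) / len(rendered)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate a rendering closer to the reference image.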
