Consistent Instance Field for Dynamic Scene Understanding

About

We introduce Consistent Instance Field, a continuous and probabilistic spatio-temporal representation for dynamic scene understanding. Unlike prior methods that rely on discrete tracking or view-dependent features, our approach disentangles visibility from persistent object identity by modeling each space-time point with an occupancy probability and a conditional instance distribution. To realize this, we introduce a novel instance-embedded representation based on deformable 3D Gaussians, which jointly encode radiance and semantic information and are learned directly from input RGB images and instance masks through differentiable rasterization. Furthermore, we introduce new mechanisms to calibrate per-Gaussian identities and resample Gaussians toward semantically active regions, ensuring consistent instance representations across space and time. Experiments on HyperNeRF and Neu3D datasets demonstrate that our method significantly outperforms state-of-the-art methods on novel-view panoptic segmentation and open-vocabulary 4D querying tasks.
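The factorization described above, an occupancy probability paired with a conditional instance distribution at each space-time point, can be illustrated with a small numerical sketch. This is not the authors' implementation: the function name `instance_field`, the sigmoid for occupancy, and the softmax over instance scores are all assumptions chosen for illustration, and the inputs are made-up values rather than outputs of the Gaussian representation.

```python
import numpy as np

def instance_field(occupancy_logit, instance_logits):
    """Joint probability that a point is occupied AND belongs to each instance.

    occupancy_logit: shape (N,), hypothetical raw occupancy scores per point.
    instance_logits: shape (N, K), hypothetical raw scores for K instances.
    """
    occupancy_logit = np.asarray(occupancy_logit, dtype=float)
    instance_logits = np.asarray(instance_logits, dtype=float)
    # Occupancy probability via a sigmoid (an assumption, not from the paper).
    p_occ = 1.0 / (1.0 + np.exp(-occupancy_logit))
    # Conditional instance distribution via a stable softmax (also an assumption).
    shifted = instance_logits - instance_logits.max(axis=1, keepdims=True)
    exp_scores = np.exp(shifted)
    p_inst = exp_scores / exp_scores.sum(axis=1, keepdims=True)
    # Joint: P(occupied, instance=k) = P(occupied) * P(instance=k | occupied).
    return p_occ[:, None] * p_inst

# Two points with identical instance scores but different occupancy scores.
joint = instance_field([2.0, -3.0], [[4.0, 0.0, 0.0], [4.0, 0.0, 0.0]])
```

In this toy input both points prefer the same instance, but the second has a low occupancy score, so its joint probabilities shrink while the ranking of instances stays unchanged. That separation between "is something here" and "which object is it" is the property the abstract attributes to the representation.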

Junyi Wu, Van Nguyen Nguyen, Benjamin Planche, Jiachen Tao, Changchang Sun, Zhongpai Gao, Zhenghao Zhao, Anwesa Choudhuri, Gengyu Zhang, Meng Zheng, Feiran Wang, Terrence Chen, Yan Yan, Ziyan Wu • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Open-vocabulary 4D querying | HyperNeRF americano scene | Mean Accuracy | 99.02 | 6 |
| Open-vocabulary 4D querying | HyperNeRF espresso | mAcc | 99.73 | 6 |
| Novel-view Panoptic Segmentation | Neu3D coffee martini | mAcc (Pixel) | 96.07 | 5 |
| Novel-view Panoptic Segmentation | Neu3D cook spinach | mAcc (Pixel) | 96.63 | 5 |
| Novel-view Panoptic Segmentation | Neu3D cut roasted beef | Pixel Accuracy (mAcc-pix) | 95.12 | 5 |
| Novel-view Panoptic Segmentation | Neu3D flame salmon | mAcc (Pixel) | 91.31 | 5 |
| Novel-view Panoptic Segmentation | Neu3D flame steak | Pixel Acc | 95.31 | 5 |
| Novel-view Panoptic Segmentation | Neu3D sear steak | mAcc (Pixel) | 95.36 | 5 |
| Panoptic Segmentation | HyperNeRF americano | Pixel Accuracy | 98.4 | 5 |
| Panoptic Segmentation | HyperNeRF split-cookie | mAcc (pix) | 97.93 | 5 |
Showing 10 of 14 rows
