HoHoNet: 360 Indoor Holistic Understanding with Latent Horizontal Features

About

We present HoHoNet, a versatile and efficient framework for holistic understanding of an indoor 360-degree panorama using a Latent Horizontal Feature (LHFeat). The compact LHFeat flattens the features along the vertical direction and has shown success in modeling per-column modalities for room layout reconstruction. HoHoNet advances in two important aspects. First, the deep architecture is redesigned to run faster with improved accuracy. Second, we propose a novel horizon-to-dense module, which relaxes the per-column output shape constraint, allowing per-pixel dense prediction from LHFeat. HoHoNet is fast: it runs at 52 FPS and 110 FPS with ResNet-50 and ResNet-34 backbones, respectively, when modeling dense modalities from a high-resolution $512 \times 1024$ panorama. HoHoNet is also accurate. On layout estimation and semantic segmentation, HoHoNet achieves results on par with the current state of the art. On dense depth estimation, HoHoNet outperforms all prior art by a large margin.
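To make the shapes in the abstract concrete, the following is a minimal sketch of the two ideas it names: flattening a feature map along the vertical direction into a per-column feature, and expanding that per-column feature back into a per-pixel map. It is not the authors' architecture. The average pooling, the function names `flatten_vertical` and `horizon_to_dense`, and the `basis` matrix are all assumptions made for illustration; the abstract does not say how either step is implemented.

```python
import numpy as np

def flatten_vertical(feat):
    """Collapse a (C, H, W) feature map to (C, W) by averaging over height.

    Average pooling is an assumption made for illustration only.
    """
    return feat.mean(axis=1)

def horizon_to_dense(lhfeat, basis):
    """Expand a (C, W) horizontal feature to an (H, W) dense map.

    `basis` is a hypothetical (H, C) matrix: each column's C-vector is
    mapped to H values, so every pixel in that column gets a prediction.
    """
    return basis @ lhfeat

rng = np.random.default_rng(0)
C, H, W = 8, 16, 32                      # toy sizes, not the paper's
feat = rng.standard_normal((C, H, W))    # stand-in for backbone features
lhfeat = flatten_vertical(feat)          # one C-vector per image column
basis = rng.standard_normal((H, C))      # stand-in for learned weights
dense = horizon_to_dense(lhfeat, basis)  # one value per pixel
print(lhfeat.shape, dense.shape)         # (8, 32) (16, 32)
```

The point of the sketch is only the shape bookkeeping: the compact feature has no height axis, yet a per-column mapping can still produce an output with one value per pixel.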

Cheng Sun, Min Sun, Hwann-Tzong Chen • 2020

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Semantic segmentation | Stanford2D3DS (3-fold cross-validation) | mIoU | 56.73 | 90 |
| Monocular Depth Estimation | Stanford2D3D (test) | δ1 Accuracy | 90.54 | 81 |
| Semantic segmentation | Stanford2D3D Panoramic 1.0 (Fold-1) | mIoU | 52 | 53 |
| Depth Estimation | Matterport3D | delta1 | 94.15 | 50 |
| Semantic segmentation | Structured3D (val) | mIoU | 69.51 | 49 |
| Monocular Depth Estimation | Matterport3D (test) | Delta Acc (< 1.25) | 87.86 | 48 |
| Semantic segmentation | Stanford2D3D-Panoramic (SPan) v1 (averaged by 3 folds) | mIoU | 52 | 39 |
| Semantic segmentation | Structured3D (test) | mIoU | 66.99 | 34 |
| Semantic segmentation | Stanford2D3D | mIoU | 43.3 | 32 |
| 3D Semantic Segmentation | Matterport3D (test) | mIoU | 32.02 | 32 |
Showing 10 of 48 rows
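
For readers unfamiliar with the two metric families in the table, here is a minimal sketch of their standard definitions. The δ1 accuracy is the fraction of pixels whose predicted-to-true depth ratio (taken in whichever direction is larger) is below 1.25, and mIoU is the mean of per-class intersection over union. The function names, the positive-depth validity mask, and the choice to skip classes absent from both maps are assumptions; each benchmark's exact evaluation protocol is not given on this page and may differ.

```python
import numpy as np

def delta1_accuracy(pred, gt, threshold=1.25):
    """Fraction of valid pixels with max(pred/gt, gt/pred) < threshold."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    # Assumption: pixels with non-positive depth are treated as invalid.
    valid = (gt > 0) & (pred > 0)
    ratio = np.maximum(pred[valid] / gt[valid], gt[valid] / pred[valid])
    return float((ratio < threshold).mean())

def mean_iou(pred, gt, num_classes):
    """Mean of per-class IoU, skipping classes absent from both maps."""
    pred = np.asarray(pred)
    gt = np.asarray(gt)
    ious = []
    for c in range(num_classes):
        inter = np.sum((pred == c) & (gt == c))
        union = np.sum((pred == c) | (gt == c))
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))

# Toy inputs, not benchmark data.
d1 = delta1_accuracy([1.0, 2.0, 4.0, 3.0], [1.0, 2.2, 2.0, 3.0])
miou = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
print(d1)  # 0.75
```

Both functions return fractions in [0, 1]; the table reports the same quantities as percentages.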
