HoHoNet: 360 Indoor Holistic Understanding with Latent Horizontal Features

About

We present HoHoNet, a versatile and efficient framework for holistic understanding of an indoor 360-degree panorama using a Latent Horizontal Feature (LHFeat). The compact LHFeat flattens the features along the vertical direction and has shown success in modeling per-column modality for room layout reconstruction. HoHoNet advances in two important aspects. First, the deep architecture is redesigned to run faster with improved accuracy. Second, we propose a novel horizon-to-dense module, which relaxes the per-column output shape constraint, allowing per-pixel dense prediction from LHFeat. HoHoNet is fast: it runs at 52 FPS and 110 FPS with ResNet-50 and ResNet-34 backbones, respectively, when modeling dense modalities from a high-resolution $512 \times 1024$ panorama. HoHoNet is also accurate. On the tasks of layout estimation and semantic segmentation, HoHoNet achieves results on par with the current state of the art. On dense depth estimation, HoHoNet outperforms all prior art by a large margin.
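To make the shape bookkeeping in the abstract concrete, the following is a minimal sketch of the two ideas it names: collapsing a feature map along the vertical axis into a per-column feature, and expanding that per-column feature back into a per-pixel map. It is not the authors' implementation. The mean pooling, the linear expansion, and the names `flatten_vertical`, `horizon_to_dense`, and `weights` are all assumptions chosen for illustration; the actual network learns both steps.

```python
import numpy as np


def flatten_vertical(feat: np.ndarray) -> np.ndarray:
    """Collapse a (C, H, W) feature map to a (C, W) per-column feature.

    Assumption: mean pooling over height stands in for the learned
    height compression described in the abstract.
    """
    return feat.mean(axis=1)


def horizon_to_dense(lhfeat: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Expand a (C, W) per-column feature to an (H, W) dense map.

    Assumption: a hypothetical linear map `weights` of shape (H, C) is
    applied independently to each image column.
    """
    return weights @ lhfeat


rng = np.random.default_rng(0)
C, H, W = 8, 16, 32                      # toy sizes, far below 512 x 1024
feat = rng.standard_normal((C, H, W))    # stand-in for backbone features

lhfeat = flatten_vertical(feat)          # shape (C, W): one vector per column
weights = rng.standard_normal((H, C))    # hypothetical expansion weights
dense = horizon_to_dense(lhfeat, weights)  # shape (H, W): one value per pixel
```

The point of the sketch is only that the intermediate representation has no height dimension, while the output recovers one value per pixel.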

Cheng Sun, Min Sun, Hwann-Tzong Chen • 2020

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Semantic segmentation | Stanford2D3DS (3-fold cross-validation) | mIoU 56.73 | 90 |
| Monocular Depth Estimation | Stanford2D3D (test) | δ1 Accuracy 90.54 | 71 |
| Monocular Depth Estimation | Matterport3D (test) | Delta Acc (< 1.25) 87.86 | 48 |
| Semantic segmentation | Stanford2D3D Panoramic 1.0 (Fold-1) | mIoU 52 | 43 |
| Semantic segmentation | Stanford2D3D-Panoramic (SPan) v1 (averaged by 3 folds) | mIoU 52 | 39 |
| Depth Estimation | Matterport3D | delta1 94.15 | 35 |
| Semantic segmentation | Stanford2D3D | mIoU 43.3 | 32 |
| Room Layout Estimation | MatterportLayout (test) | 2D IoU 82.71 | 28 |
| Semantic segmentation | Structured3D (test) | mIoU 66.99 | 21 |
| Monocular 360 Depth Estimation | Matterport3D official (test) | Delta Acc (1.25x) 87.86 | 20 |
Showing 10 of 43 rows
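
For readers unfamiliar with the metrics listed, here is a minimal sketch of how the δ1 accuracy (fraction of pixels whose predicted-to-true depth ratio is within 1.25) and mIoU (mean intersection-over-union across classes) are commonly computed. The function names and the masking of non-positive ground-truth depth are assumptions for illustration; each benchmark's official evaluation protocol may differ in its details. Both functions return fractions, whereas the results listed here are percentages.

```python
import numpy as np


def delta1_accuracy(pred, gt, threshold=1.25):
    """Fraction of valid pixels with max(pred/gt, gt/pred) < threshold.

    Assumption: pixels with non-positive ground-truth depth are invalid
    and skipped; predictions are assumed to be strictly positive.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    valid = gt > 0
    ratio = np.maximum(pred[valid] / gt[valid], gt[valid] / pred[valid])
    return float((ratio < threshold).mean())


def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union over classes present in pred or gt."""
    pred = np.asarray(pred)
    gt = np.asarray(gt)
    ious = []
    for c in range(num_classes):
        union = np.logical_or(pred == c, gt == c).sum()
        if union == 0:
            continue  # class absent from both maps; skip it
        inter = np.logical_and(pred == c, gt == c).sum()
        ious.append(inter / union)
    return float(np.mean(ious))


# Toy depth example: ratios are 1.0, 1.0, 2.0, 1.2, so 3 of 4 pixels pass.
d1 = delta1_accuracy([1.0, 2.0, 4.0, 1.0], [1.0, 2.0, 2.0, 1.2])

# Toy label example: class 0 IoU is 1/2, class 1 IoU is 2/3.
miou = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```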
