Two Stream 3D Semantic Scene Completion

About

Inferring the 3D geometry and the semantic meaning of occluded surfaces is a very challenging task. Recently, a first end-to-end learning approach was proposed that completes a scene from a single depth image. That approach voxelizes the scene and predicts, for each voxel, whether it is occupied and, if so, its semantic class label. In this work, we propose a two-stream approach that leverages both depth information and semantic information inferred from the RGB image. The approach constructs an incomplete 3D semantic tensor, which uses a compact three-channel encoding for the inferred semantic information, and uses a 3D CNN to infer the complete 3D semantic tensor. In our experimental evaluation, we show that the proposed two-stream approach substantially outperforms the state of the art for semantic scene completion.
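To make the "incomplete 3D semantic tensor" more concrete, the following is a minimal sketch of how per-pixel semantic labels could be lifted into a voxel grid using the depth image. It is not the authors' implementation: the pinhole back-projection, the sparse dictionary layout, the function names, and in particular the base-3 three-channel code in `encode_class` are assumptions made for illustration, since the abstract does not spell out the encoding.

```python
import math


def backproject(u, v, depth, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) with metric depth to camera coordinates."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def encode_class(label, levels=3):
    """Hypothetical three-channel code (an assumption, not the paper's scheme).

    The label is written as three base-`levels` digits, each scaled to [0, 1].
    With levels=3 this separates up to 27 classes; label 0 maps to (0, 0, 0).
    """
    digits = []
    for _ in range(3):
        digits.append((label % levels) / (levels - 1))
        label //= levels
    return tuple(digits)


def build_semantic_tensor(depth_map, label_map, intrinsics, voxel_size, origin, grid_shape):
    """Return a sparse tensor {(i, j, k): (c0, c1, c2)} for voxels hit by a surface pixel.

    Voxels that are never hit stay absent, which is what makes the tensor incomplete:
    occluded regions carry no semantic code until a completion network fills them in.
    """
    fx, fy, cx, cy = intrinsics
    tensor = {}
    for v, (depth_row, label_row) in enumerate(zip(depth_map, label_map)):
        for u, (d, label) in enumerate(zip(depth_row, label_row)):
            if d <= 0:  # missing depth reading
                continue
            point = backproject(u, v, d, fx, fy, cx, cy)
            idx = tuple(math.floor((p - o) / voxel_size) for p, o in zip(point, origin))
            if all(0 <= i < n for i, n in zip(idx, grid_shape)):
                tensor[idx] = encode_class(label)
    return tensor


# Toy 2x2 input with made-up intrinsics, only to exercise the functions.
toy_tensor = build_semantic_tensor(
    depth_map=[[1.0, 0.0], [1.0, 2.0]],
    label_map=[[1, 2], [3, 4]],
    intrinsics=(1.0, 1.0, 0.0, 0.0),
    voxel_size=0.5,
    origin=(0.0, 0.0, 0.0),
    grid_shape=(4, 4, 4),
)
```

In this sketch a dense version of `toy_tensor` would be the three-channel input volume; a 3D CNN, as described in the abstract, would then predict a class label for every voxel, including the occluded ones.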

Martin Garbade, Yueh-Tung Chen, Johann Sawatzky, Juergen Gall • 2018

Related benchmarks

Task	Dataset	Metric	Result
Semantic Scene Completion	SemanticKITTI (val)	mIoU (Mean IoU)	17.7	119
Semantic Scene Completion	NYU v2 (test)	Ceiling Error	9.7	81
Semantic Occupancy Prediction	SemanticKITTI (test)	mIoU	50.6	67
Scene Completion	NYUCAD (test)	mIoU	76.9	60
Semantic Scene Completion	SemanticKITTI official (test)	mIoU	17.7	50
Scene Completion	NYU dataset (test)	mIoU	60	50
Scene Completion	NYU v2 (test)	mIoU	60.7	48
Semantic Scene Completion	NYU (test)	Ceiling Error	9.7	46
Semantic Scene Completion	NYUCAD (test)	Error Rate (Ceiling)	25.9	44
Scene Completion	NYUCAD	mIoU	76.1	32

Showing 10 of 19 rows
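Most rows above report mIoU (mean intersection over union). As a rough illustration of that metric, the sketch below computes it over flattened voxel labels. The function name, the `ignore_label` value, and the choice to average only over classes that actually occur are assumptions; each benchmark defines its own evaluation protocol (for example, which voxels are scored and whether the empty class counts), so this is not the official evaluation code of any dataset listed.

```python
def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean IoU over flattened voxel labels.

    `pred` and `target` are equal-length sequences of integer class ids.
    Voxels whose target equals `ignore_label` are skipped; predictions are
    assumed to lie in [0, num_classes). Classes that appear in neither
    sequence are left out of the average.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            # A mismatch enlarges the union of both the predicted and the true class.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: class 0 is perfect, classes 1 and 2 each have IoU 1/2.
score = mean_iou(pred=[0, 1, 1, 2], target=[0, 1, 2, 2], num_classes=3)
```

Multiplying the returned value by 100 gives a percentage on the same scale as the numbers in the table.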
