
Semantic Ray: Learning a Generalizable Semantic Field with Cross-Reprojection Attention

About

In this paper, we aim to learn a semantic radiance field from multiple scenes that is accurate, efficient, and generalizable. While most existing NeRFs target the tasks of neural scene rendering, image synthesis, and multi-view reconstruction, a few attempts such as Semantic-NeRF explore learning high-level semantic understanding with the NeRF structure. However, Semantic-NeRF simultaneously learns color and semantic labels from a single ray with multiple heads, and a single ray fails to provide rich semantic information. As a result, Semantic-NeRF relies on positional encoding and needs to train one specific model for each scene. To address this, we propose Semantic Ray (S-Ray) to fully exploit semantic information along the ray direction from its multi-view reprojections. Since directly performing dense attention over multi-view reprojected rays would incur heavy computational cost, we design a Cross-Reprojection Attention module with consecutive intra-view radial and cross-view sparse attentions, which decomposes contextual information along reprojected rays and across multiple views, and then collects dense connections by stacking the modules. Experiments show that our S-Ray is able to learn from multiple scenes, and it presents strong generalization ability to adapt to unseen scenes.
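The decomposition above can be illustrated with a toy NumPy sketch. The tensor shapes, the `attention` helper, and the cost accounting below are illustrative assumptions, not the paper's implementation: it treats reprojected features as an `(M views, N samples, C channels)` array, applies attention first along each ray (intra-view radial) and then across views per sample (cross-view sparse), and compares the number of attention-score entries against dense attention over all `M*N` tokens.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product self-attention over the second-to-last axis;
    # leading axes are treated as batch dimensions.
    d = q.shape[-1]
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(d)
    return softmax(scores) @ v

# Toy reprojected semantic features: M views, N samples per ray, C channels
# (hypothetical sizes, for illustration only).
M, N, C = 8, 64, 32
rng = np.random.default_rng(0)
feats = rng.normal(size=(M, N, C))

# Stage 1: intra-view radial attention, attending along the N samples of each ray.
intra = attention(feats, feats, feats)                 # (M, N, C)

# Stage 2: cross-view sparse attention, attending across the M views per sample.
swapped = np.swapaxes(intra, 0, 1)                     # (N, M, C)
cross = np.swapaxes(attention(swapped, swapped, swapped), 0, 1)  # (M, N, C)

# Dense attention over all M*N tokens vs. the two decomposed stages.
dense_cost = (M * N) ** 2          # 262144 score entries
decomposed_cost = M * N * N + N * M * M  # 36864 score entries
print(cross.shape, dense_cost, decomposed_cost)
```

Stacking such modules lets information propagate both along and across rays while keeping each stage's attention quadratic only in N or M, not in their product.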

Fangfu Liu, Chubin Zhang, Yu Zheng, Yueqi Duan · 2023

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Novel View Synthesis | ScanNet | PSNR 29.27 | 58 |
| Semantic View Synthesis (Novel View) | ScanNet V2 (val) | mIoU 56 | 12 |
| Semantic Segmentation | Replica synthetic (test) | Total Acc 96.38 | 9 |
| Semantic Segmentation | ScanNet real (test) | Total Acc 98.2 | 9 |
