ViewSRD: 3D Visual Grounding via Structured Multi-View Decomposition

About

3D visual grounding aims to identify and localize objects in 3D space based on textual descriptions. However, existing methods struggle to disentangle targets from anchors in complex multi-anchor queries and to resolve inconsistencies in spatial descriptions caused by perspective variations. To tackle these challenges, we propose ViewSRD, a framework that formulates 3D visual grounding as a structured multi-view decomposition process. First, the Simple Relation Decoupling (SRD) module restructures complex multi-anchor queries into a set of targeted single-anchor statements, generating a structured set of perspective-aware descriptions that clarify positional relationships. These decomposed representations serve as the foundation for the Multi-view Textual-Scene Interaction (Multi-TSI) module, which integrates textual and scene features across multiple viewpoints using shared Cross-modal Consistent View Tokens (CCVTs) to preserve spatial correlations. Finally, a Textual-Scene Reasoning module synthesizes the multi-view predictions into a unified and robust 3D visual grounding result. Experiments on 3D visual grounding datasets show that ViewSRD significantly outperforms state-of-the-art methods, particularly on complex queries requiring precise spatial differentiation. Code is available at https://github.com/visualjason/ViewSRD.
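To make the final step more concrete, the sketch below shows one simple way per-view predictions could be combined into a single choice. It is not the paper's Textual-Scene Reasoning module, which is a learned component; plain averaging of per-view softmax probabilities is swapped in here as a stand-in, and the names `softmax` and `fuse_views` as well as the toy logits are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_views(per_view_logits):
    """Fuse per-view scores over the same candidate objects.

    ASSUMPTION: simple probability averaging, used only for illustration.
    The actual ViewSRD reasoning module is learned and is not reproduced here.

    per_view_logits: list of views, each a list with one logit per candidate.
    Returns (index of the best candidate, fused probabilities).
    """
    if not per_view_logits:
        raise ValueError("need at least one view")
    n = len(per_view_logits[0])
    if n == 0 or any(len(v) != n for v in per_view_logits):
        raise ValueError("every view must score the same non-empty candidate set")

    # Convert each view's logits to probabilities, then average across views.
    probs = [softmax(v) for v in per_view_logits]
    fused = [sum(p[i] for p in probs) / len(probs) for i in range(n)]

    # Pick the candidate with the highest fused probability.
    best = max(range(n), key=fused.__getitem__)
    return best, fused


if __name__ == "__main__":
    # Three views scoring three candidate objects (toy numbers).
    views = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [3.0, 0.0, 0.0]]
    best, fused = fuse_views(views)
    print(best, [round(p, 3) for p in fused])
```

In this toy input one view disagrees with the other two, and averaging lets the majority of views decide. A learned fusion could instead weight views by how well each matches the perspective implied by the query.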

Ronggang Huang, Haoxin Yang, Yan Cai, Xuemiao Xu, Huaidong Zhang, Shengfeng He • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| 3D Visual Grounding | ScanRefer | Acc@0.5 | 29 | 142 |
| 3D Visual Grounding | Nr3D (test) | Overall Success Rate | 69.9 | 88 |
| 3D Visual Grounding | Nr3D | Overall Success Rate | 69.9 | 83 |
| 3D Visual Grounding | Sr3D (test) | Overall Accuracy | 76 | 73 |
| 3D Visual Grounding | ScanRefer Unique | Acc@0.25 (IoU=0.25) | 82.1 | 41 |
| 3D Visual Grounding | ScanRefer Overall | Acc@0.25 | 45.4 | 41 |
| 3D Visual Grounding | ScanRefer (test) | Unique Accuracy | 82.1 | 21 |
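
The Acc@0.25 and Acc@0.5 metrics above are commonly defined as the fraction of queries whose predicted box overlaps the ground-truth box with an intersection-over-union (IoU) at or above the threshold. The sketch below illustrates that definition under the assumption of axis-aligned boxes given as `(x_min, y_min, z_min, x_max, y_max, z_max)`; the function names and the toy boxes are hypothetical, and this is not the official evaluation script of any benchmark.

```python
def box_volume(box):
    """Volume of an axis-aligned box; degenerate boxes count as zero."""
    x0, y0, z0, x1, y1, z1 = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0) * max(0.0, z1 - z0)


def iou_3d(a, b):
    """IoU of two axis-aligned 3D boxes (ASSUMPTION: no rotation)."""
    # The intersection box takes the larger minimum and smaller maximum per axis.
    inter_box = (
        max(a[0], b[0]), max(a[1], b[1]), max(a[2], b[2]),
        min(a[3], b[3]), min(a[4], b[4]), min(a[5], b[5]),
    )
    inter = box_volume(inter_box)
    union = box_volume(a) + box_volume(b) - inter
    return inter / union if union > 0 else 0.0


def acc_at_iou(pred_boxes, gt_boxes, threshold):
    """Fraction of predictions whose IoU with the paired ground truth >= threshold."""
    if len(pred_boxes) != len(gt_boxes):
        raise ValueError("predictions and ground truths must be paired")
    if not pred_boxes:
        return 0.0
    hits = sum(iou_3d(p, g) >= threshold for p, g in zip(pred_boxes, gt_boxes))
    return hits / len(pred_boxes)


if __name__ == "__main__":
    pred = [(0, 0, 0, 2, 2, 2)]
    gt = [(1, 0, 0, 3, 2, 2)]  # overlaps half of the predicted box
    print(acc_at_iou(pred, gt, 0.25), acc_at_iou(pred, gt, 0.5))
```

The toy pair has an IoU of 1/3, so it counts as a hit at the 0.25 threshold and a miss at 0.5, which is why Acc@0.5 is never higher than Acc@0.25 on the same predictions.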