PASR: Pose-Aware 3D Shape Retrieval from Occluded Single Views
About
Single-view 3D shape retrieval is a fundamental yet challenging task that is increasingly important with the growth of available 3D data. Existing approaches largely fall into two categories: those that use contrastive learning to map point cloud features into existing vision-language spaces, and those that learn a common embedding space for 2D images and 3D shapes. However, these feed-forward, holistic alignments are often difficult to interpret, which in turn limits their robustness and generalization to real-world applications. To address this problem, we propose Pose-Aware 3D Shape Retrieval (PASR), a framework that formulates retrieval as a feature-level analysis-by-synthesis problem by distilling knowledge from a 2D foundation model (DINOv3) into a 3D encoder. By aligning pose-conditioned 3D projections with 2D feature maps, our method bridges the gap between real-world images and synthetic meshes. During inference, PASR performs a test-time optimization via analysis-by-synthesis, jointly searching for the shape and pose that best reconstruct the patch-level feature map of the input image. This synthesis-based optimization is inherently robust to partial occlusion and sensitive to fine-grained geometric details. PASR outperforms existing methods by a wide margin on both clean and occluded 3D shape retrieval datasets. Additionally, PASR demonstrates strong multi-task capabilities, achieving robust shape retrieval, competitive pose estimation, and accurate category classification within a single framework.
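The joint shape-and-pose search described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes the synthesized feature maps for every candidate (shape, pose) pair are already available as an array, replaces the test-time optimization with an exhaustive search over a discrete grid, and uses a trimmed mean of per-patch cosine similarities as one simple way to tolerate occluded patches. The names `cosine_map`, `retrieve`, and `keep_ratio` are hypothetical.

```python
import numpy as np

def cosine_map(a, b, eps=1e-8):
    """Per-patch cosine similarity between feature maps with trailing shape (N, D)."""
    a = a / (np.linalg.norm(a, axis=-1, keepdims=True) + eps)
    b = b / (np.linalg.norm(b, axis=-1, keepdims=True) + eps)
    return np.sum(a * b, axis=-1)

def retrieve(image_feats, synth_feats, keep_ratio=0.7):
    """Pick the (shape, pose) whose synthesized features best match the image.

    image_feats: (N, D) patch features of the input image (assumed given).
    synth_feats: (S, P, N, D) synthesized features for S shapes and P poses
                 (assumed precomputed; a stand-in for a pose-conditioned 3D encoder).
    keep_ratio:  fraction of best-matching patches kept per candidate, so that
                 occluded patches do not dominate the score (an assumption).
    """
    sims = cosine_map(image_feats[None, None], synth_feats)  # (S, P, N)
    k = max(1, int(round(keep_ratio * sims.shape[-1])))
    top_k = np.sort(sims, axis=-1)[..., -k:]                  # best k patches
    scores = top_k.mean(axis=-1)                              # (S, P)
    shape_idx, pose_idx = np.unravel_index(np.argmax(scores), scores.shape)
    return int(shape_idx), int(pose_idx), scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    synth = rng.normal(size=(3, 4, 16, 8))      # 3 shapes, 4 poses, 16 patches
    image = synth[2, 1].copy()                  # ground truth: shape 2, pose 1
    image[:4] = rng.normal(size=(4, 8))         # simulate 4 occluded patches
    print(retrieve(image, synth)[:2])
```

Because the score is computed per patch before aggregation, the similarity map also shows which image regions support or contradict a candidate, which is the interpretability argument the abstract makes against holistic embeddings.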
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| 3D Pose Estimation | Pix3D | Accuracy (π/6) | 89.1 | 12 |
| 3D Pose Estimation | Pascal3D | Accuracy (π/6) | 80.2 | 12 |
| 3D Shape Retrieval | Pascal3D (L0) | Top-1 Accuracy | 76.43 | 5 |
| 3D Shape Retrieval | Pascal3D (L1) | Top-1 Accuracy | 73.21 | 5 |
| 3D Shape Retrieval | Pascal3D (L2) | Top-1 Accuracy | 71.49 | 5 |
| 3D Shape Retrieval | Pascal3D (L3) | Top-1 Accuracy | 63.05 | 5 |
| 3D Shape Retrieval | Pix3D (L0) | Bed Accuracy | 63.83 | 5 |
| 3D Shape Retrieval | Pix3D (L1) | Retrieval Score (Bed) | 61.7 | 5 |
| 3D Shape Retrieval | Pix3D (L2) | Accuracy (Bed) | 57.45 | 5 |
| 3D Shape Retrieval | Pix3D (L3) | Bed Retrieval Score | 52.66 | 5 |
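
For readers unfamiliar with the Accuracy (π/6) metric in the pose estimation rows, the following is a minimal sketch of how such a score is commonly computed, assuming the usual definition: the fraction of samples whose geodesic rotation error between predicted and ground-truth rotation matrices is below π/6. The page does not state the exact evaluation protocol, so the function names and the strict threshold comparison are assumptions.

```python
import numpy as np

def geodesic_error(R_pred, R_gt):
    """Rotation angle in radians of R_pred^T R_gt for batches of (B, 3, 3) matrices."""
    rel = np.einsum("bji,bjk->bik", R_pred, R_gt)      # R_pred^T @ R_gt per sample
    trace = np.trace(rel, axis1=-2, axis2=-1)
    cos = np.clip((trace - 1.0) / 2.0, -1.0, 1.0)      # guard against rounding
    return np.arccos(cos)

def pose_accuracy(R_pred, R_gt, threshold=np.pi / 6):
    """Fraction of samples with geodesic error below the threshold."""
    return float(np.mean(geodesic_error(R_pred, R_gt) < threshold))

def rot_z(theta):
    """Rotation by theta radians about the z-axis (helper for the example)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

if __name__ == "__main__":
    gt = np.stack([rot_z(0.0)] * 4)
    pred = np.stack([rot_z(a) for a in (0.1, 0.4, 0.6, 1.0)])
    # Two of the four errors (0.1 and 0.4 rad) fall below pi/6 (about 0.524 rad).
    print(pose_accuracy(pred, gt))
```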