TIGaussian: Disentangle Gaussians for Spatial-Awared Text-Image-3D Alignment

About

While vision-language models have closely linked text and image features, incorporating 3D modality data, such as point clouds and 3D Gaussians, further enables pretraining for 3D-related tasks, e.g., cross-modal retrieval, zero-shot classification, and scene recognition. Because challenges remain in extracting 3D modal features and bridging the gap between modalities, we propose TIGaussian, a framework that harnesses the characteristics of 3D Gaussian Splatting (3DGS) to strengthen cross-modality alignment through a multi-branch 3DGS tokenizer and modality-specific 3D feature alignment strategies. Specifically, our multi-branch 3DGS tokenizer decouples the intrinsic properties of 3DGS structures into compact latent representations, enabling more generalizable feature extraction. To further bridge the modality gap, we develop bidirectional cross-modal alignment strategies: a multi-view feature fusion mechanism that leverages diffusion priors to resolve perspective ambiguity in image-3D alignment, and a text-3D projection module that adaptively maps 3D features to the text embedding space for better text-3D alignment. Extensive experiments on various datasets demonstrate the state-of-the-art performance of TIGaussian on multiple tasks.
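To make the idea of a shared embedding space concrete, here is a minimal sketch of zero-shot classification by cosine similarity. It is not TIGaussian's implementation: the function names, the labels, and the 3-dimensional toy vectors are all assumptions for illustration, and real embeddings would come from trained 3D and text encoders.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_classify(shape_embedding, text_embeddings):
    """Return the label whose text embedding is closest to the 3D embedding."""
    return max(
        text_embeddings,
        key=lambda label: cosine(shape_embedding, text_embeddings[label]),
    )


# Hypothetical toy embeddings (assumed values, not model outputs).
text_embeddings = {
    "chair": [0.9, 0.1, 0.0],
    "lamp": [0.0, 1.0, 0.2],
    "table": [0.1, 0.0, 1.0],
}
shape_embedding = [0.8, 0.2, 0.1]

predicted = zero_shot_classify(shape_embedding, text_embeddings)
print(predicted)
```

Once 3D features are aligned with text features, no task-specific classifier is needed: the candidate labels are simply embedded as text and compared with the 3D embedding.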

Jiarun Liu, Qifeng Chen, Yiru Zhao, Minghua Liu, Baorui Ma, Sheng Yang • 2026

Related benchmarks

Task | Dataset | Result | Rank
3D Object Classification | Objaverse-LVIS (test) | Top-1 Accuracy 50.1 | 95
Object Recognition | Objaverse LVIS | Top-1 Acc 41.76 | 25
Scene Recognition | SUN RGB-D Scene (test) | -- | 25
3D-Image Retrieval | Objaverse LVIS | R@1 54.11 | 8
Object Classification | Objaverse LVIS 1.0 (evaluation) | Top-1 Acc 46.7 | 8
Classification | ABO | Top-1 Acc 61.7 | 5
Image-3D Retrieval | ABO | Top-1 Accuracy 66.15 | 4
Text-3D Retrieval | Objaverse LVIS | Top-1 Accuracy 21.2 | 4
Text-3D Retrieval | ABO | Top-1 Accuracy 15.87 | 4
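
The retrieval results above are reported as R@1 or Top-1 accuracy. This page does not say how those numbers were computed, so the following is only a sketch of the common definition of Recall@K, assuming a square similarity matrix in which query i's correct match is gallery item i. The function name and the toy scores are hypothetical.

```python
def recall_at_k(similarity, k=1):
    """Fraction of queries whose correct item is among the top-k scores.

    similarity[i][j] is the score between query i and gallery item j;
    the correct match for query i is assumed to be gallery item i.
    """
    hits = 0
    for i, row in enumerate(similarity):
        # Gallery indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(similarity)


# Hypothetical scores for 3 queries against 3 gallery items.
similarity = [
    [0.9, 0.2, 0.1],  # query 0 ranks its own item first
    [0.3, 0.4, 0.8],  # query 1 ranks its own item second
    [0.2, 0.1, 0.7],  # query 2 ranks its own item first
]

r1 = recall_at_k(similarity, k=1)
r2 = recall_at_k(similarity, k=2)
print(r1, r2)
```

With K = 1 this reduces to top-1 retrieval accuracy: a query counts as correct only if its highest-scoring gallery item is the true match.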