CoRe3D: Collaborative Reasoning as a Foundation for 3D Intelligence

About

Recent advances in large multimodal models suggest that explicit reasoning mechanisms play a critical role in improving model reliability, interpretability, and cross-modal alignment. While such reasoning-centric approaches have proven effective in language and vision tasks, their extension to 3D remains underdeveloped. CoRe3D introduces a unified reasoning framework for 3D understanding and generation that operates jointly over semantic and spatial abstractions, enabling high-level intent inferred from language to directly guide low-level 3D content formation. Central to this design is a spatially grounded reasoning representation that decomposes the 3D latent space into localized regions, allowing the model to reason over geometry in a compositional and procedural manner. By tightly coupling semantic chain-of-thought inference with structured spatial reasoning, CoRe3D produces 3D outputs that exhibit strong local consistency and faithful alignment with linguistic descriptions.
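
To make the idea of decomposing a 3D latent space into localized regions concrete, here is a minimal sketch. It is an illustration only: the abstract does not say how CoRe3D forms its regions, so the cubic grid, the non-overlapping block tiling, and the name `split_into_regions` are all hypothetical assumptions, not the authors' method.

```python
from itertools import product


def split_into_regions(grid_size, region_size):
    """Partition a cubic grid of latent cells into non-overlapping cubic regions.

    Hypothetical helper for illustration; not taken from CoRe3D.
    Returns a dict mapping a region index (rx, ry, rz) to the list of
    cell coordinates (x, y, z) that fall inside that region.
    """
    if grid_size % region_size != 0:
        raise ValueError("grid_size must be a multiple of region_size")
    regions = {}
    for x, y, z in product(range(grid_size), repeat=3):
        # Integer division maps each cell to the block that contains it.
        key = (x // region_size, y // region_size, z // region_size)
        regions.setdefault(key, []).append((x, y, z))
    return regions


# Assumed toy sizes: a 4x4x4 grid tiled by 2x2x2 blocks.
regions = split_into_regions(grid_size=4, region_size=2)
```

With these toy sizes the grid splits into eight blocks of eight cells each, and a model could in principle process or describe one block at a time.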

Tianjiao Yu, Xinzhuo Li, Yifan Shen, Yuanzhe Liu, Ismini Lourentzou • 2025

Related benchmarks

Task                            Dataset                    Result            Rank
Language Understanding          MMLU                       Accuracy 67.6     756
Physical Commonsense Reasoning  PIQA                       Accuracy 79.4     329
Social Commonsense Reasoning    SIQA                       Accuracy 41.5     32
3D Object Captioning            Objaverse (held-out set)   BLEU-1 24.02      7
Image-to-3D                     Objaverse                  CLIP Score 0.86   5
Text-to-3D                      Objaverse                  CLIP Score 0.3    5
