Beyond Spatial Compression: Interface-Centric Generative States for Open-World 3D Structure
About
Current 3D tokenizers largely treat representation as spatial compression: compact codes reconstruct surface geometry, but leave component ownership and attachment validity implicit. In open-world assets with intersecting components, noisy topology, and weak canonical structure, this creates a representation mismatch: local shape, component identity, and assembly relations become entangled in a latent stream and are not natively addressable during decoding. We formulate an alternative view, interface-centric generative states, in which tokenization constructs an operational state rather than a passive compressed code. The state exposes local geometry, component ownership, and attachment validity as variables that can be queried, constrained, and repaired during decoding. We instantiate this formulation with Component-Conditioned Canonical Local Tokens (C2LT-3D), factorizing representation into canonical local geometry, partition-conditioned context, and relational seam variables. Each factor targets a distinct failure mode of compression-centric tokens: pose leakage, cross-component interference, or invalid local attachment. This exposed state supports attachment validation, latent structural repair, targeted intervention, and constrained serialization without a separate post-hoc structure recovery module. Trained on single-object CAD models and evaluated zero-shot on open-world multi-component assets, C2LT-3D improves structural robustness and shows that its latent variables remain actionable under adversarial attachment settings. These results suggest that open-world 3D generative representations should be evaluated not only by reconstruction fidelity, but by whether their discrete states remain operational for assembly-level structural reasoning.
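To make the idea of an "operational state" concrete, here is a minimal sketch of a data structure in which local geometry, component ownership, and attachment validity are separately addressable. It is not the C2LT-3D implementation: the class names (`LocalToken`, `Seam`, `GenerativeState`), their fields, and the boolean validity flag are all hypothetical, chosen only to illustrate querying a state by component and by seam validity.

```python
from dataclasses import dataclass, field


@dataclass
class LocalToken:
    # Hypothetical fields: an index into a codebook of canonical local
    # geometry, plus the id of the component that owns this token.
    geometry_code: int
    component_id: int


@dataclass
class Seam:
    # Hypothetical relational variable linking two tokens (by index),
    # with an explicit validity flag for the attachment.
    token_a: int
    token_b: int
    valid: bool


@dataclass
class GenerativeState:
    tokens: list[LocalToken]
    seams: list[Seam] = field(default_factory=list)

    def tokens_of(self, component_id: int) -> list[int]:
        """Indices of all tokens owned by the given component."""
        return [i for i, t in enumerate(self.tokens)
                if t.component_id == component_id]

    def invalid_seams(self) -> list[Seam]:
        """Seams currently flagged as invalid attachments."""
        return [s for s in self.seams if not s.valid]


# Illustrative data only: two components joined by one invalid seam.
state = GenerativeState(
    tokens=[LocalToken(7, 0), LocalToken(3, 0), LocalToken(9, 1)],
    seams=[Seam(0, 1, True), Seam(1, 2, False)],
)
```

With a state of this shape, ownership and attachment queries such as `state.tokens_of(0)` or `state.invalid_seams()` are direct lookups rather than something to be recovered from an entangled latent stream, which is the contrast the abstract draws.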
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Latent Structural Repair | Objaverse-LVIS All | Validity@1 | 62.02 | 5 |
| Latent Structural Repair | Objaverse-LVIS Hard (source) | Validity@1 | 66.69 | 5 |
| Latent Structural Repair | Objaverse-LVIS Heur.-Fail | Validity@1 | 66.75 | 5 |
| Repair Ranking | Edge Bank Hard 1,360 tasks | Valid@1 | 66.69 | 5 |
| Repair Ranking | Edge Bank Heur.-Fail 1,194 tasks | Valid@1 | 66.75 | 5 |
| 3D Object Generation | Objaverse-LVIS 5,000-object (geometry-clean) | Chamfer Distance | 0.0369 | 3 |
| 3D Object Reconstruction | Objaverse-LVIS 1,024-object geometry-clean filtered | Chamfer Distance | 0.0268 | 3 |
| Structural Reconstruction | Objaverse-LVIS (geometry-clean) | Chamfer Distance | 0.0268 | 3 |
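
For readers unfamiliar with the Chamfer Distance metric reported above, the following is a minimal NumPy sketch of one common variant: the symmetric sum of mean squared nearest-neighbour distances between two point sets. The page does not state which variant (squared or unsquared, summed or averaged), sampling density, or normalization produced the listed values, so this sketch should not be expected to reproduce them. The function name `chamfer_distance` and the example points are illustrative.

```python
import numpy as np


def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Assumed variant: mean squared nearest-neighbour distance, summed over
    both directions. Other conventions exist.
    """
    # Pairwise squared Euclidean distances, shape (N, M).
    diff = a[:, None, :] - b[None, :, :]
    d2 = np.sum(diff * diff, axis=-1)
    # Nearest neighbour in b for each point of a, and vice versa.
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())


# Illustrative point sets: the second point of b is lifted by 1 along z.
a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
b = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 1.0]])
print(chamfer_distance(a, b))  # 1.0
```

The dense (N, M) distance matrix keeps the sketch short but uses memory quadratic in the number of points; for large point clouds a spatial index such as a k-d tree is the usual replacement.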