Img2CADSeq: Image-to-CAD Generation via Sequence-Based Diffusion
About
Boundary Representation (BRep) is the standard format for Computer-Aided Design (CAD), yet reconstructing high-quality BReps from single-view images remains challenging due to the complexity of topological constraints and operation sequences. We present Img2CADSeq, a multi-stage pipeline that overcomes these limitations by encoding CAD sequences into a three-level hierarchical codebook. Guided by an importance prioritization, this strategy values profiles over details, compressing long sequences into a stable discrete latent space. To bridge the modality gap, we leverage a coarse-to-fine point cloud intermediate, aligning 2D visual features with 3D CAD sequences via contrastive learning to condition a VQ-Diffusion model. Supported by newly introduced CAD-220K and PrintCAD datasets, our approach ensures robust industrial domain adaptation. Extensive experiments demonstrate that Img2CADSeq significantly outperforms state-of-the-art methods, producing standard STEP files that can be directly used in commercial CAD software.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Image-to-CAD generation | Image-conditional CAD dataset | Chamfer Distance (CD)1.21 | 6 | |
| Unconditional CAD Generation | CAD Models Unconditional (test) | MMD0.958 | 6 | |
| Point cloud-conditioned CAD generation | Clean Point Cloud (test) | Accuracy Error6.49 | 4 |