Scaling Mesh Generation via Compressive Tokenization
About
We propose a compressive yet effective mesh representation, Blocked and Patchified Tokenization (BPT), which facilitates the generation of meshes exceeding 8k faces. BPT compresses mesh sequences through block-wise indexing and patch aggregation, reducing their length by approximately 75% compared to the original sequences. This compression makes it possible to use mesh data with significantly more faces, which enriches detail and improves generation robustness. Building on BPT, we have trained a foundation mesh generative model on scaled mesh data that supports flexible control from point clouds and images. Our model generates meshes with intricate details and accurate topology, achieving state-of-the-art performance on mesh generation and reaching a level suitable for direct use in products.
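To make the compression claim concrete, the following is a minimal sketch, not the authors' implementation. It assumes an uncompressed baseline of 9 tokens per triangle (3 vertices times 3 quantized coordinates), a quantization resolution of 128, and a block size of 8; all three values, and the function names, are illustrative assumptions. `block_index` shows one plausible form of block-wise indexing, in which three coordinate tokens become a block id and an offset id. Patch aggregation is not sketched here.

```python
def token_budget(num_faces, tokens_per_face=9, compression_ratio=0.26):
    """Return (uncompressed, compressed) sequence lengths.

    tokens_per_face=9 is an assumed baseline (3 vertices x 3 coordinates);
    compression_ratio=0.26 is the ratio reported for BPT, i.e. roughly a
    75% reduction in sequence length.
    """
    raw = num_faces * tokens_per_face
    return raw, round(raw * compression_ratio)


def block_index(x, y, z, resolution=128, block_size=8):
    """Map a quantized vertex (x, y, z) to a (block_id, offset_id) pair.

    The grid is split into cubic blocks; block_id says which block the
    vertex falls in, offset_id says where it sits inside that block.
    resolution and block_size are hypothetical values for illustration.
    """
    blocks_per_axis = resolution // block_size
    bx, by, bz = x // block_size, y // block_size, z // block_size
    ox, oy, oz = x % block_size, y % block_size, z % block_size
    block_id = (bx * blocks_per_axis + by) * blocks_per_axis + bz
    offset_id = (ox * block_size + oy) * block_size + oz
    return block_id, offset_id


def block_unindex(block_id, offset_id, resolution=128, block_size=8):
    """Invert block_index, recovering the quantized (x, y, z)."""
    blocks_per_axis = resolution // block_size
    bz = block_id % blocks_per_axis
    by = (block_id // blocks_per_axis) % blocks_per_axis
    bx = block_id // (blocks_per_axis * blocks_per_axis)
    oz = offset_id % block_size
    oy = (offset_id // block_size) % block_size
    ox = offset_id // (block_size * block_size)
    return (bx * block_size + ox, by * block_size + oy, bz * block_size + oz)


if __name__ == "__main__":
    raw, compressed = token_budget(8000)
    print(f"8k faces: {raw} tokens uncompressed, about {compressed} compressed")
    print(block_index(127, 127, 127))
```

Under these assumptions, an 8k-face mesh drops from 72,000 tokens to roughly 18,720, which is what brings meshes of that size within reach of a sequence model's context window.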
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| 3D Mesh Generation | Objaverse | Chamfer Distance | 0.027 | 18 |
| Mesh Reconstruction | Toys4k | Chamfer Distance | 0.037 | 16 |
| Mesh Tokenization | 3D Mesh Representation | Compression Ratio | 0.26 | 12 |
| 3D Object Generation | ShapeNet | Chamfer Distance (CD) | 0.003 | 10 |
| 3D Mesh Reconstruction | Artistic meshes | Chamfer Distance (L2) | 0.052 | 10 |
| Mesh Generation | Hunyuan3D Dense Meshes 2.5 | Chamfer Distance (CD) | 0.109 | 7 |
| Mesh Generation | Toys4k (Artist Meshes) | Chamfer Distance (CD) | 0.046 | 7 |
| Mesh Reconstruction | Hunyuan3D Dense Meshes 2.5 (test) | CD | 0.109 | 7 |
| Mesh Reconstruction | Toys4k Artist Meshes (test) | Chamfer Distance (CD) | 0.046 | 7 |
| Mesh Tokenization | Mesh Sequences | Compression Ratio | 0.26 | 7 |
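Most rows above report Chamfer Distance between point sets sampled from the generated and reference meshes. The page does not say which variant each benchmark uses, so the sketch below is an assumption: a symmetric distance that averages nearest-neighbor distances in both directions, with a `squared` flag (a hypothetical parameter) to switch between squared and plain Euclidean distances. It is a brute-force illustration, not the evaluation code behind these numbers.

```python
import math


def chamfer_distance(a, b, squared=True):
    """Symmetric Chamfer distance between two non-empty point sets.

    a, b: sequences of equal-dimension points, e.g. [(x, y, z), ...].
    For each point in one set, find the nearest point in the other set,
    average those distances, and sum the two directions.
    Brute force, O(len(a) * len(b)); fine for small illustrative inputs.
    """
    def one_way(src, dst):
        total = 0.0
        for p in src:
            # Squared Euclidean distance to the nearest point in dst.
            best = min(sum((pi - qi) ** 2 for pi, qi in zip(p, q)) for q in dst)
            total += best if squared else math.sqrt(best)
        return total / len(src)

    return one_way(a, b) + one_way(b, a)


if __name__ == "__main__":
    generated = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
    print(chamfer_distance(generated, reference))
```

Lower values mean the generated surface lies closer to the reference. Because scaling, sampling density, and the squared versus unsquared choice all change the magnitude, values are only comparable within a single benchmark row's protocol.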