FACE: A Face-based Autoregressive Representation for High-Fidelity and Efficient Mesh Generation

About

Autoregressive models for 3D mesh generation suffer from a fundamental limitation: they flatten meshes into long vertex-coordinate sequences. This results in prohibitive computational costs, hindering the efficient synthesis of high-fidelity geometry. We argue this bottleneck stems from operating at the wrong semantic level. We introduce FACE, a novel Autoregressive Autoencoder (ARAE) framework that reconceptualizes the task by generating meshes at the face level. Our one-face-one-token strategy treats each triangle face, the fundamental building block of a mesh, as a single, unified token. This simple yet powerful design reduces the sequence length by a factor of nine, leading to an unprecedented compression ratio of 0.11, halving the previous state-of-the-art. This dramatic efficiency gain does not compromise quality; by pairing our face-level decoder with a powerful VecSet encoder, FACE achieves state-of-the-art reconstruction quality on standard benchmarks. The versatility of the learned latent space is further demonstrated by training a latent diffusion model that achieves high-fidelity, single-image-to-mesh generation. FACE provides a simple, scalable, and powerful paradigm that lowers the barrier to high-quality structured 3D content creation.
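The factor-of-nine reduction can be checked with simple arithmetic. The sketch below assumes the coordinate-level baseline emits one token per vertex coordinate (3 vertices × 3 coordinates per triangle, with no vertex sharing) and defines the compression ratio as the face-level sequence length divided by that baseline length. The function names are hypothetical and are not taken from the FACE codebase.

```python
# Illustrative arithmetic only; helper names are hypothetical.
VERTICES_PER_FACE = 3   # a triangle face has three vertices
COORDS_PER_VERTEX = 3   # each vertex has x, y, z coordinates


def coordinate_level_length(num_faces: int) -> int:
    """Sequence length when every vertex coordinate is its own token (assumed baseline)."""
    return num_faces * VERTICES_PER_FACE * COORDS_PER_VERTEX


def face_level_length(num_faces: int) -> int:
    """Sequence length under a one-face-one-token scheme."""
    return num_faces


def compression_ratio(num_faces: int) -> float:
    """Face-level length relative to the coordinate-level baseline."""
    return face_level_length(num_faces) / coordinate_level_length(num_faces)


num_faces = 4000
print(coordinate_level_length(num_faces))       # 36000
print(face_level_length(num_faces))             # 4000
print(round(compression_ratio(num_faces), 2))   # 0.11
```

Under these assumptions the ratio is 1/9 regardless of mesh size, which matches the 0.11 figure quoted in the abstract.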

Hanxiao Wang, Yuan-Chen Guo, Ying-Tian Liu, Zi-Xin Zou, Biao Zhang, Weize Quan, Ding Liang, Yan-Pei Cao, Dong-Ming Yan • 2026

Related benchmarks

Task                  Dataset                   Result                      Rank
Mesh Reconstruction   Toys4k                    Chamfer Distance: 0.033     16
Mesh Tokenization     3D Mesh Representation    Compression Ratio: 0.11     12
Mesh Reconstruction   Objaverse (test)          Hausdorff Distance: 0.09    5
Mesh Reconstruction   Famous                    Hausdorff Distance: 0.077   5