Articulation-aware Canonical Surface Mapping

About

We tackle the tasks of: 1) predicting a Canonical Surface Mapping (CSM) that indicates the mapping from 2D pixels to corresponding points on a canonical template shape, and 2) inferring the articulation and pose of the template corresponding to the input image. While previous approaches rely on keypoint supervision for learning, we present an approach that can learn without such annotations. Our key insight is that these tasks are geometrically related, and we can obtain supervisory signal via enforcing consistency among the predictions. We present results across a diverse set of animal object categories, showing that our method can learn articulation and CSM prediction from image collections using only foreground mask labels for training. We empirically show that allowing articulation helps learn more accurate CSM prediction, and that enforcing the consistency with predicted CSM is similarly critical for learning meaningful articulation.
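The consistency idea in the abstract can be illustrated with a small sketch: if the CSM maps a pixel to a 3D point on the template, then projecting that point back with the predicted pose should land near the original pixel. The code below is only an illustration under stated assumptions, not the authors' implementation. It assumes a scaled orthographic camera, leaves out articulation entirely, and the function name `reprojection_consistency` and its parameters are hypothetical.

```python
import numpy as np


def reprojection_consistency(pixels, points_3d, rotation, translation, scale):
    """Mean squared distance between pixels and their reprojected template points.

    Hypothetical helper for illustration only.
    pixels:      (N, 2) pixel coordinates
    points_3d:   (N, 3) template points that a CSM assigns to those pixels
    rotation:    (3, 3) predicted camera rotation
    translation: (2,)   predicted image-plane translation
    scale:       float  predicted scale (scaled orthographic camera, an assumption)
    """
    # Rotate the template points into the camera frame.
    rotated = points_3d @ rotation.T
    # Orthographic projection: drop depth, then apply scale and translation.
    projected = scale * rotated[:, :2] + translation
    # Average squared pixel error over all points.
    return float(np.mean(np.sum((projected - pixels) ** 2, axis=1)))


# Usage: a 90-degree rotation about the z-axis, scale 2, translation (1, -1).
rotation = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
points_3d = np.array([[1.0, 2.0, 3.0]])
pixels = np.array([[-3.0, 1.0]])  # exactly where the posed point projects
loss = reprojection_consistency(pixels, points_3d, rotation, np.array([1.0, -1.0]), 2.0)
print(loss)
```

When the mapping and the pose agree, as in the usage lines, the error is zero. Any disagreement between the two predictions raises it, which is what makes such a term usable as a training signal without keypoint labels.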

Nilesh Kulkarni, Abhinav Gupta, David F. Fouhey, Shubham Tulsiani • 2020

Related benchmarks

Task	Dataset	Result
3D Shape Reconstruction	Animodel (test)	Chamfer Distance (Horse) 2.73	12
3D Shape Reconstruction	Pascal (test)	Horse AUC 37.4	12
3D Shape Reconstruction and Camera Pose Estimation	Animal Pose Horse (test)	AUC 51	12
3D Shape Reconstruction and Camera Pose Estimation	Animal Pose Sheep (test)	AUC 31.4	11
3D Shape Reconstruction and Camera Pose Estimation	Animal Pose Cow (test)	AUC 46.2	11
Point cloud generation	Animodel-Points (Cow)	Chamfer Distance (cm) 2.35	10
Point cloud generation	Animodel-Points Sheep	Chamfer Distance (cm) 2.48	10
Dense Correspondence	CUB (val)	PCK@0.1 51	10
Keypoint Transfer	PASCAL VOC within training animal categories 1.0 (test)	PCK Transfer (Horse) 44.6	9
Keypoint Transfer	CUB Bird (test)	PCK@0.1 42.6	8

Showing 10 of 22 rows
