MagicPony: Learning Articulated 3D Animals in the Wild

About

We consider the problem of predicting the 3D shape, articulation, viewpoint, texture, and lighting of an articulated animal like a horse given a single test image as input. We present a new method, dubbed MagicPony, that learns this predictor purely from in-the-wild single-view images of the object category, with minimal assumptions about the topology of deformation. At its core is an implicit-explicit representation of articulated shape and appearance, combining the strengths of neural fields and meshes. In order to help the model understand an object's shape and pose, we distil the knowledge captured by an off-the-shelf self-supervised vision transformer and fuse it into the 3D model. To overcome local optima in viewpoint estimation, we further introduce a new viewpoint sampling scheme that comes at no additional training cost. MagicPony outperforms prior work on this challenging task and demonstrates excellent generalisation in reconstructing art, despite the fact that it is only trained on real images.

Shangzhe Wu, Ruining Li, Tomas Jakab, Christian Rupprecht, Andrea Vedaldi • 2022

Related benchmarks

Task	Dataset	Result
Semantic Correspondence	SPair-71k (test)	--	146
3D Shape Reconstruction	Animodel (test)	Chamfer Distance (Horse): 2.58	12
Point cloud generation	Animodel-Points (Cow)	Chamfer Distance (cm): 2.53	10
Point cloud generation	Animodel-Points (Sheep)	Chamfer Distance (cm): 3	10
Keypoint Transfer	CUB Bird (test)	PCK@0.1: 55.5	8
Point Cloud Reconstruction	Animodel-Points	RMS CD (cm): 11.19	8
Keypoint Transfer	CUB Bird excluding 50 aquatic bird classes (test)	PCK@0.1: 63.5	6
3D Animal Pose and Shape Estimation	Animal3D (test)	--	6
Keypoint Transfer	PASCAL VOC 12 (test)	PCK (Horse): 42.9	5
3D Animal Reconstruction	PASCAL VOC Horses 10 (test)	KT-PCK@0.1: 42.9	4

Showing 10 of 17 rows
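
The table reports two families of metrics: PCK@0.1 for keypoint transfer and Chamfer distance for shape and point-cloud reconstruction. As a rough illustration of what these numbers measure, the sketch below computes both with NumPy. It is not the evaluation code behind any of the benchmarks listed: the function names, the use of a single `bbox_size` scalar for normalisation, and the unsquared symmetric form of the Chamfer distance are all assumptions, and the individual benchmarks may differ in normalisation, units, and averaging.

```python
import numpy as np


def pck(pred, gt, bbox_size, alpha=0.1):
    """Percentage of Correct Keypoints (illustrative).

    A predicted keypoint counts as correct when its Euclidean distance to the
    ground-truth keypoint is at most alpha * bbox_size. Which box side is used
    for bbox_size is an assumption left to the caller.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    dists = np.linalg.norm(pred - gt, axis=1)
    return float(np.mean(dists <= alpha * bbox_size))


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two point sets (illustrative).

    Uses unsquared nearest-neighbour distances, averaged in each direction
    and summed. Some benchmarks use squared distances or a different
    averaging; this is only one common variant.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Pairwise distance matrix of shape (len(a), len(b)).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return float(d.min(axis=1).mean() + d.min(axis=0).mean())
```

For example, with two keypoints, one exact and one off by 20 pixels in a 100-pixel box, `pck(..., alpha=0.1)` gives 0.5, which would be reported as 50.0 in the percentage convention used in the table.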
