Unsupervised learning of object landmarks by factorized spatial embeddings

About

Automatically learning the structure of object categories remains an important open problem in computer vision. In this paper, we propose a novel unsupervised approach that can discover and learn landmarks in object categories, thus characterizing their structure. Our approach is based on factorizing image deformations, as induced by a viewpoint change or an object deformation, by learning a deep neural network that detects landmarks consistently with such visual effects. Furthermore, we show that the learned landmarks establish meaningful correspondences between different object instances in a category without having to impose this requirement explicitly. We assess the method qualitatively on a variety of object types, natural and man-made. We also show that our unsupervised landmarks are highly predictive of manually annotated landmarks in face benchmark datasets, and can be used to regress these with a high degree of accuracy.

James Thewlis, Hakan Bilen, Andrea Vedaldi • 2017
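
The abstract's requirement that landmarks be detected "consistently" with a deformation can be illustrated with a small check: if an image is warped by a known transformation, the landmarks detected on the warped image should coincide with the warped landmarks of the original. The sketch below is only an illustration of that idea, not the paper's training objective. It assumes an affine warp, and the names `apply_affine` and `equivariance_error` are hypothetical helpers introduced here.

```python
import numpy as np

def apply_affine(points, A, t):
    """Apply the (assumed) affine warp g(p) = A @ p + t to an (N, 2) array of points."""
    return points @ A.T + t

def equivariance_error(landmarks_x, landmarks_gx, A, t):
    """Mean Euclidean distance between landmarks detected on the warped image
    (landmarks_gx) and the warped landmarks of the original image (g applied
    to landmarks_x). Zero means the detections are consistent with the warp."""
    expected = apply_affine(landmarks_x, A, t)
    return float(np.mean(np.linalg.norm(landmarks_gx - expected, axis=1)))

# Toy data: two landmarks and a 90-degree rotation plus a translation.
pts = np.array([[10.0, 20.0], [30.0, 40.0]])
A = np.array([[0.0, -1.0], [1.0, 0.0]])
t = np.array([5.0, -3.0])

# A hypothetical detector that is perfectly consistent with the warp.
consistent = apply_affine(pts, A, t)
# A hypothetical detector whose output is offset by (3, 4) pixels.
inconsistent = consistent + np.array([3.0, 4.0])

print(equivariance_error(pts, consistent, A, t))
print(equivariance_error(pts, inconsistent, A, t))
```

A quantity of this kind could serve as a penalty during training, since it needs only the known warp and no landmark annotations; whether and how the paper uses such a term is not stated on this page.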

Related benchmarks

Task	Dataset	Metric	Result
Landmark Localization	AFLW (test)	NME (%)	10.53	54
Landmark Prediction	MAFL (test)	Mean Error (%)	5.33	38
Facial Landmark Detection	MAFL (test)	Normalised MSE (%)	6.67	30
Landmark Regression	MAFL (test)	MSE (%)	6.67	28
Landmark Regression	wild CelebA (test)	Mean Normalized L2 Error	31.3	17
Landmark Detection	CelebA Wild (K=8) (test)	Normalized L2 Distance (%)	31.3	14
Landmark Prediction	300-W (test)	Landmark Prediction Error	9.3	12
Keypoint Detection	Human3.6M	Mean L2 Error	7.51	11
Keypoint Detection	CUB-200-2011 all	Mean L2 Error	30.1	11
Landmark Prediction	Cat head (test)	Mean Error (%)	0.2676	10

Showing 10 of 22 rows
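
Several rows above report a normalized landmark error as a percentage. As a rough guide to how such a number is typically computed, the sketch below averages the Euclidean distance between predicted and ground-truth landmarks and divides by a reference distance. The choice of reference distance (face benchmarks often use the inter-ocular distance) and the exact protocol for each dataset are assumptions here, as this page does not specify them. `normalized_mean_error` is a hypothetical helper name.

```python
import numpy as np

def normalized_mean_error(pred, gt, norm_dist):
    """Mean per-landmark Euclidean error, divided by a reference distance
    and expressed as a percentage. pred and gt are (N, 2) arrays."""
    per_point = np.linalg.norm(pred - gt, axis=1)
    return 100.0 * float(per_point.mean()) / norm_dist

# Toy example: two landmarks, one off by 5 pixels, reference distance of 10 pixels.
gt = np.array([[0.0, 0.0], [10.0, 0.0]])
pred = np.array([[3.0, 4.0], [10.0, 0.0]])
print(normalized_mean_error(pred, gt, norm_dist=10.0))  # 25.0
```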
