TUSK: Task-Agnostic Unsupervised Keypoints

About

Existing unsupervised methods for keypoint learning rely heavily on the assumption that a specific keypoint type (e.g. elbow, digit, abstract geometric shape) appears only once in an image. This greatly limits their applicability, as each instance must be isolated before applying the method, an issue that is never discussed or evaluated. We thus propose a novel method to learn Task-agnostic, UnSupervised Keypoints (TUSK) which can deal with multiple instances. To achieve this, instead of the commonly used strategy of detecting multiple heatmaps, each dedicated to a specific keypoint type, we use a single heatmap for detection, and enable unsupervised learning of keypoint types through clustering. Specifically, we encode semantics into the keypoints by teaching them to reconstruct images from a sparse set of keypoints and their descriptors, where the descriptors are forced to form distinct clusters in feature space around learned prototypes. This makes our approach amenable to a wider range of tasks than any previous unsupervised keypoint method: we show experiments on multiple-instance detection and classification, object discovery, and landmark detection, all unsupervised, with performance on par with the state of the art, while also being able to deal with multiple instances.

Yuhe Jin, Weiwei Sun, Jan Hosang, Eduard Trulls, Kwang Moo Yi • 2022
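
The inference-time idea in the abstract (one shared heatmap for detection, with keypoint types recovered by clustering descriptors around prototypes) can be illustrated with a small sketch. This is not the authors' implementation: in TUSK the heatmap, descriptors, and prototypes are learned through image reconstruction, whereas here they are hand-set toy values. The function names (local_maxima, nearest_prototype, detect_and_type), the 3x3 peak test, and the Euclidean nearest-prototype rule are all assumptions chosen for illustration.

```python
import math

def local_maxima(heatmap, k, threshold=0.0):
    """Return up to k (row, col) peaks that are strict maxima of their 3x3 neighbourhood.

    Assumption: a simple 3x3 peak test stands in for whatever keypoint
    selection the actual method uses.
    """
    h, w = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(h):
        for c in range(w):
            v = heatmap[r][c]
            if v <= threshold:
                continue
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(v > n for n in neighbours):
                peaks.append((v, r, c))
    peaks.sort(reverse=True)  # strongest responses first
    return [(r, c) for _, r, c in peaks[:k]]

def nearest_prototype(descriptor, prototypes):
    """Index of the prototype closest to the descriptor (Euclidean distance)."""
    dists = [math.dist(descriptor, p) for p in prototypes]
    return dists.index(min(dists))

def detect_and_type(heatmap, descriptor_map, prototypes, k):
    """Detect keypoints on ONE heatmap, then give each a type via its descriptor."""
    return [
        (r, c, nearest_prototype(descriptor_map[r][c], prototypes))
        for r, c in local_maxima(heatmap, k)
    ]

# Toy inputs (hypothetical values, not taken from the paper).
heatmap = [
    [0.0, 0.1, 0.0, 0.0, 0.0],
    [0.1, 0.9, 0.1, 0.0, 0.0],
    [0.0, 0.1, 0.0, 0.2, 0.8],
    [0.0, 0.0, 0.0, 0.1, 0.2],
]
descriptor_map = [[[0.0, 0.0] for _ in range(5)] for _ in range(4)]
descriptor_map[1][1] = [0.9, 0.1]   # close to prototype 0
descriptor_map[2][4] = [0.2, 1.1]   # close to prototype 1
prototypes = [[1.0, 0.0], [0.0, 1.0]]

result = detect_and_type(heatmap, descriptor_map, prototypes, k=3)
print(result)  # [(1, 1, 0), (2, 4, 1)]
```

Because detection does not depend on the type, nothing in this sketch prevents several peaks from mapping to the same prototype, which is the property that lets a single-heatmap design handle multiple instances of one keypoint type.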

Related benchmarks

Task	Dataset	Metric	Result	
Landmark Regression	wild CelebA (test)	Mean Normalized L2 Error	18.49	17
Object Detection	MNIST-Hard (test)	Localization Accuracy	99.9	5
Landmark Detection	Human3.6M (test)	Normalized Error	6.88	4
Object Discovery	CLEVR6 (test)	ARI	0.983	4
Object Discovery	Tetrominoes (test)	ARI	99.7	3
Unsupervised Property Classification	CLEVR6 (test)	Shape Acc	46.8	1
Unsupervised Property Classification	Tetrominoes (test)	Shape Acc	91.3	1
