Learning to Cluster Faces via Transformer

About

Face clustering is a useful tool for applications like automatic face annotation and retrieval. The main challenge is that images of the same identity are difficult to cluster when they differ in face pose, occlusion, and image quality. Traditional clustering methods usually ignore the relationship between individual images and their neighbors, which may contain useful context information. In this paper, we repurpose the well-known Transformer and introduce a Face Transformer for supervised face clustering. In Face Transformer, we decompose face clustering into two steps: relation encoding and linkage prediction. Specifically, given a face image, a relation encoder module aggregates local context information from its neighbors, and a linkage predictor module judges whether a pair of images belongs to the same cluster. In the local linkage graph view, Face Transformer can generate more robust node and edge representations than existing methods. Experiments on both MS-Celeb-1M and DeepFashion show that our method achieves state-of-the-art performance, e.g., 91.12% in pairwise F-score on MS-Celeb-1M.
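The two-step decomposition described in the abstract can be illustrated with a small sketch. This is not the paper's model: it assumes an untrained, single-head dot-product attention with identity projections as a stand-in for the relation encoder, and a cosine-similarity threshold as a stand-in for the learned linkage predictor. All function names (`knn`, `relation_encode`, `predict_links`, `cluster`) and the parameters `k` and `threshold` are hypothetical and chosen only for illustration.

```python
import numpy as np


def normalise(features):
    """L2-normalise each row so dot products are cosine similarities."""
    features = np.asarray(features, dtype=float)
    return features / np.linalg.norm(features, axis=1, keepdims=True)


def knn(features, k):
    """Indices of the k most similar other rows for every row."""
    sims = features @ features.T
    np.fill_diagonal(sims, -np.inf)  # exclude self from the neighbor list
    return np.argsort(-sims, axis=1)[:, :k]


def relation_encode(features, neighbors):
    """Stand-in relation encoder (assumption): each node attends over
    itself and its neighbors with plain scaled dot-product attention."""
    d = features.shape[1]
    out = np.empty_like(features)
    for i, nbrs in enumerate(neighbors):
        context = np.vstack([features[i], features[nbrs]])  # (k + 1, d)
        scores = context @ features[i] / np.sqrt(d)
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        out[i] = weights @ context
    return normalise(out)


def predict_links(encoded, neighbors, threshold):
    """Stand-in linkage predictor (assumption): link a node to a neighbor
    when their encoded features are similar enough."""
    return [(i, int(j))
            for i, nbrs in enumerate(neighbors)
            for j in nbrs
            if encoded[i] @ encoded[j] >= threshold]


def cluster(n, links):
    """Merge linked nodes into clusters with union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)
    return [find(i) for i in range(n)]


feats = normalise([[1.0, 0.0], [0.99, 0.1], [0.98, -0.1],
                   [0.0, 1.0], [0.1, 0.99], [-0.1, 0.98]])
nbrs = knn(feats, k=2)
labels = cluster(len(feats), predict_links(relation_encode(feats, nbrs), nbrs, 0.9))
```

In this toy input the first three vectors and the last three vectors form two well-separated groups, so each group ends up sharing one cluster label. A trained encoder and predictor would replace the fixed attention and the threshold.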

Jinxing Ye, Xiaojiang Peng, Baigui Sun, Kai Wang, Xiuyu Sun, Hao Li, Hanqing Wu • 2021

Related benchmarks

Task             Dataset                          Metric            Result  Rank
Face Clustering  MS1M 584K unlabeled (test)       Pairwise F-score  91.12   20
Face Clustering  MS1M 4.05M unlabeled (test)      Pairwise F-score  84.1    20
Face Clustering  MS1M 1.74M unlabeled (test)      Pairwise F-score  89.07   20
Face Clustering  MS1M 2.89M unlabeled (test)      Pairwise F-score  86.78   20
Face Clustering  MS1M 5.21M unlabeled (test)      Pairwise F-score  83.86   20
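For readers unfamiliar with the pairwise F-score cited in the abstract, the sketch below computes it under the common definition: precision and recall are taken over all pairs of samples, where a pair counts as positive when both samples share a label. The function name `pairwise_f_score` is hypothetical, and the benchmark's exact evaluation script is not given on this page, so treat this as an assumption about how the metric is typically computed.

```python
from collections import Counter


def pairwise_f_score(true_labels, pred_labels):
    """Pairwise F-score between a ground-truth and a predicted clustering."""

    def count_pairs(counter):
        # Number of unordered sample pairs inside each group, summed.
        return sum(c * (c - 1) // 2 for c in counter.values())

    # Pairs placed together in both the ground truth and the prediction.
    same_both = count_pairs(Counter(zip(true_labels, pred_labels)))
    same_pred = count_pairs(Counter(pred_labels))
    same_true = count_pairs(Counter(true_labels))

    if same_pred == 0 or same_true == 0 or same_both == 0:
        return 0.0
    precision = same_both / same_pred
    recall = same_both / same_true
    return 2 * precision * recall / (precision + recall)


perfect = pairwise_f_score([0, 0, 1, 1], [0, 0, 1, 1])
partial = pairwise_f_score([0, 0, 0, 1], [0, 0, 1, 1])
```

In the second call, one of the two predicted pairs is correct (precision 0.5) and one of the three true pairs is recovered (recall 1/3), which gives an F-score of 0.4.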