Semantic Alignment: Finding Semantically Consistent Ground-truth for Facial Landmark Detection

About

Recently, deep-learning-based facial landmark detection has achieved great success. Despite this, we notice that semantic ambiguity greatly degrades detection performance. Specifically, semantic ambiguity means that some landmarks (e.g. those evenly distributed along the face contour) do not have a clear and accurate definition, causing inconsistent annotations by annotators. These inconsistent annotations, which are usually provided by public databases, commonly serve as the ground-truth to supervise network training, leading to degraded accuracy. To our knowledge, little research has investigated this problem. In this paper, we propose a novel probabilistic model which introduces a latent variable to optimize, i.e. the 'real' ground-truth, which is semantically consistent. This framework couples two parts: (1) training the landmark detection CNN and (2) searching for the 'real' ground-truth. These two parts are alternately optimized: the searched 'real' ground-truth supervises the CNN training, and the trained CNN assists the search for the 'real' ground-truth. In addition, to recover landmarks that are predicted with low confidence due to occlusion and low image quality, we propose a global heatmap correction unit (GHCU) to correct outliers by considering the global face shape as a constraint. Extensive experiments on both image-based (300W and AFLW) and video-based (300-VW) databases demonstrate that our method effectively improves landmark detection accuracy and achieves state-of-the-art performance.

Zhiwei Liu, Xiangyu Zhu, Guosheng Hu, Haiyun Guo, Ming Tang, Zhen Lei, Neil M. Robertson, Jinqiao Wang • 2019
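
The alternating scheme described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: a linear regressor stands in for the CNN, and the "search" step is replaced by a closed-form Gaussian blend of the model prediction and the original annotation. The function name `alternate`, the parameters `sigma_pred` and `sigma_ann`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def alternate(x, annotations, rounds=20, steps=10, lr=0.1,
              sigma_pred=1.0, sigma_ann=1.0):
    """Toy alternating optimization; a linear model stands in for the CNN."""
    n, d = x.shape
    w = np.zeros((d, annotations.shape[1]))
    # Latent "real" ground truth, initialised from the noisy annotations.
    targets = annotations.copy()
    # Precisions of the two Gaussian terms (an assumed simplification).
    p_pred = 1.0 / sigma_pred ** 2
    p_ann = 1.0 / sigma_ann ** 2
    for _ in range(rounds):
        # Part (1): train the model on the current latent targets.
        for _ in range(steps):
            grad = x.T @ (x @ w - targets) / n
            w -= lr * grad
        # Part (2): re-estimate the latent targets. Each target is pulled
        # toward the model prediction while staying near its annotation.
        pred = x @ w
        targets = (p_pred * pred + p_ann * annotations) / (p_pred + p_ann)
    return w, targets

# Synthetic data: clean landmarks are a linear function of the features,
# and the annotations add independent noise to them.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 5))
w_true = rng.normal(size=(5, 2))
clean = x @ w_true
annotations = clean + rng.normal(scale=1.0, size=clean.shape)

w, refined = alternate(x, annotations)
err_ann = np.linalg.norm(annotations - clean, axis=1).mean()
err_ref = np.linalg.norm(refined - clean, axis=1).mean()
print(f"mean error of annotations: {err_ann:.3f}")
print(f"mean error of refined targets: {err_ref:.3f}")
```

In this toy setting the refined targets end up closer to the clean landmarks than the raw annotations, because the model averages out annotation noise across samples. The paper's search over heatmaps and its GHCU are not modelled here.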

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Facial Landmark Detection | AFLW Full | NME 0.016 | 101 |
| Facial Landmark Detection | 300-W public Challenging inter-pupil normalization (test) | NME 6.38 | 46 |
| Landmark Localization | 300W Common | NME 3.45 | 44 |
| Facial Landmark Detection | 300W inter-pupil distance normalized (Full set) | NME 4.02 | 19 |
| Facial Landmark Detection | 300W inter-pupil distance normalized (Common set) | NME 3.45 | 19 |
| Landmark Detection | 300W Challenge | NME 6.38 | 17 |
| Landmark Detection | 300W (full) | NME 4.02 | 17 |
| Face Alignment | 300W Challenging 68 landmarks | NME 6.38 | 14 |
| Face Alignment | 300W Full set 68 landmarks | NME 4.02 | 13 |
| Face Alignment | 300W Common 68 landmarks | NME (300W Common) 3.45 | 13 |
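
For readers unfamiliar with the metric in the benchmark rows, the following is a minimal sketch of how a normalized mean error (NME) with inter-pupil normalization is commonly computed. It is not taken from the paper. The helper names `nme` and `inter_pupil_distance` are hypothetical, and the eye index ranges assume the common 68-point markup with 0-based indices 36 to 41 and 42 to 47 for the two eyes. The 300W values listed here look like percentages and the AFLW value like a raw fraction, but that reading is an assumption.

```python
import numpy as np

# Assumed 0-based eye indices in the 68-point markup.
EYE_A_IDX = slice(36, 42)
EYE_B_IDX = slice(42, 48)

def inter_pupil_distance(gt):
    """Distance between the two eye centres, each taken as the mean of its landmarks."""
    centre_a = gt[EYE_A_IDX].mean(axis=0)
    centre_b = gt[EYE_B_IDX].mean(axis=0)
    return float(np.linalg.norm(centre_a - centre_b))

def nme(pred, gt, norm_dist):
    """Mean per-landmark Euclidean error divided by a normalizing distance."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    per_point = np.linalg.norm(pred - gt, axis=-1)
    return float(per_point.mean() / norm_dist)

# Synthetic face: eye centres 100 px apart, every prediction off by (3, 4) px.
gt = np.zeros((68, 2))
gt[EYE_B_IDX] = [100.0, 0.0]
pred = gt + np.array([3.0, 4.0])

d = inter_pupil_distance(gt)
score = nme(pred, gt, d)
print(f"NME = {score:.4f} ({100 * score:.2f}%)")
```

Here every landmark is off by 5 pixels and the eye centres are 100 pixels apart, so the NME is 0.05, or 5% when expressed as a percentage.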
