Towards Accurate Facial Landmark Detection via Cascaded Transformers

About

Accurate facial landmarks are essential prerequisites for many tasks related to human faces. In this paper, an accurate facial landmark detector is proposed based on cascaded transformers. We formulate facial landmark detection as a coordinate regression task so that the model can be trained end-to-end. With self-attention in transformers, our model can inherently exploit the structured relationships between landmarks, which benefits landmark detection under challenging conditions such as large pose and occlusion. During cascaded refinement, our model uses a deformable attention mechanism to extract the most relevant image features around the target landmark for coordinate prediction, which yields more accurate alignment. In addition, we propose a novel decoder that refines image features and landmark positions simultaneously. With only a small increase in parameters, detection performance improves further. Our model achieves new state-of-the-art performance on several standard facial landmark detection benchmarks and shows good generalization ability in cross-dataset evaluation.

Hui Li, Zidong Guo, Seon-Min Rhee, Seungju Han, Jae-Joon Han • 2022
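
The cascaded refinement described in the abstract can be pictured as a loop in which each stage nudges the current landmark estimate. The sketch below is only an illustration under stated assumptions: it assumes each stage outputs a residual offset that is added to the current coordinates, which the abstract does not spell out, and `make_toy_stage` is a hypothetical stand-in for a learned decoder layer, not the authors' model.

```python
import numpy as np

def cascaded_refine(init_landmarks, stages):
    """Refine (N, 2) landmark coordinates through a sequence of stages.

    Assumption: each stage is a callable that maps the current coordinates
    to a per-landmark offset of the same shape.
    """
    coords = np.asarray(init_landmarks, dtype=float)
    history = [coords.copy()]
    for stage in stages:
        coords = coords + stage(coords)  # residual update
        history.append(coords.copy())
    return coords, history

def make_toy_stage(target, step=0.5):
    """Hypothetical stage: moves a fraction `step` of the way to a fixed target.

    A real stage would predict the offset from image features instead.
    """
    target = np.asarray(target, dtype=float)
    return lambda coords: step * (target - coords)

if __name__ == "__main__":
    target = np.array([[1.0, 1.0], [2.0, 2.0]])
    stages = [make_toy_stage(target) for _ in range(3)]
    final, history = cascaded_refine(np.zeros((2, 2)), stages)
    print(final)
```

With this toy stage the remaining error halves at every step, which is the only point of the example: later stages start from a better estimate than earlier ones.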

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Facial Landmark Detection | 300-W (Common) | NME | 2.59 | 180 |
| Facial Landmark Detection | 300-W (Fullset) | Mean Error (%) | 2.96 | 174 |
| Facial Landmark Detection | 300W (Challenging) | NME | 4.5 | 159 |
| Facial Landmark Detection | WFLW (test) | Mean Error (ME) - All | 4.05 | 122 |
| Face Alignment | 300W (Challenging) | NME | 4.5 | 93 |
| Face Alignment | 300W Common | NME | 2.59 | 90 |
| Face Alignment | 300-W (Full) | NME | 2.96 | 66 |
| Facial Landmark Detection | WFLW (Full) | NME (%) | 4.08 | 65 |
| Facial Landmark Detection | 300W | NME | 2.96 | 52 |
| Landmark Localization | 300W Common | NME | 2.59 | 44 |
Showing 10 of 21 rows
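
Most rows above report a normalized mean error (NME). As a rough guide to how such a number is computed, the sketch below averages the per-landmark Euclidean error for one face and divides by a normalizing distance. The page does not say which normalization each row uses; inter-ocular distance is a common choice and is assumed here, and the landmark indices passed to `interocular_distance` are hypothetical.

```python
import numpy as np

def nme(pred, gt, norm_dist):
    """Normalized mean error for a single face.

    pred, gt: (N, 2) arrays of predicted and ground-truth landmarks.
    norm_dist: scalar normalizing distance (assumed inter-ocular here).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    per_point = np.linalg.norm(pred - gt, axis=1)  # Euclidean error per landmark
    return per_point.mean() / norm_dist

def interocular_distance(gt, left_idx, right_idx):
    """Distance between two ground-truth landmarks (hypothetical eye indices)."""
    gt = np.asarray(gt, dtype=float)
    return np.linalg.norm(gt[left_idx] - gt[right_idx])

if __name__ == "__main__":
    gt = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 5.0]])
    pred = gt + np.array([3.0, 4.0])       # every landmark is off by 5 pixels
    d = interocular_distance(gt, 0, 1)     # 10 pixels in this toy example
    print(100.0 * nme(pred, gt, d))        # reported as a percentage
```

A dataset-level score is typically the mean of this per-face value over all test images, multiplied by 100 when reported as a percentage.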
