Pose-Invariant Face Alignment with a Single CNN

About

Face alignment has witnessed substantial progress in the last decade. One of the recent focuses has been aligning a dense 3D face shape to face images with large head poses. The dominant technology used is based on the cascade of regressors, e.g., CNN, which has shown promising results. Nonetheless, the cascade of CNNs suffers from several drawbacks, e.g., lack of end-to-end training, hand-crafted features and slow training speed. To address these issues, we propose a new layer, named visualization layer, that can be integrated into the CNN architecture and enables joint optimization with different loss functions. Extensive evaluation of the proposed method on multiple datasets demonstrates state-of-the-art accuracy, while reducing the training time by more than half compared to the typical cascade of CNNs. In addition, we compare multiple CNN architectures with the visualization layer to further demonstrate the advantage of its utilization.

Amin Jourabloo, Mao Ye, Xiaoming Liu, Liu Ren • 2017

Related benchmarks

Task	Dataset	Metric	Result
Facial Landmark Detection	300-W (Common)	NME	0.0543	180
Facial Landmark Detection	300-W (Fullset)	Mean Error (%)	6.3	174
Facial Landmark Detection	300W (Challenging)	NME	9.88	159
Face Alignment	300W (Challenging)	NME	0.0988	93
Face Alignment	300W Common	NME	5.43	90
Face Alignment	300W Fullset (test)	NME	6.3	82
Face Alignment	300-W (Full)	NME	6.3	66
Facial Landmark Detection	300-W Challenging Subset	Mean Error	9.88	49
Facial Landmark Localization	300-W (Full set)	NME	6.3	46
Face Alignment	AFLW 21 landmarks	NME	4.45	37

Showing 10 of 14 rows
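
The rows above report NME on two scales: as a fraction (0.0543, 0.0988) and as a percentage (5.43, 9.88). The following is a minimal sketch of how such a value is commonly computed, assuming NME is the mean per-landmark Euclidean error divided by a normalizing distance. The normalizer used by each benchmark (for example inter-ocular distance or face bounding-box size) is not stated on this page, so `norm_dist`, the function name `nme`, and the toy coordinates are all hypothetical and do not reproduce any listed result.

```python
import math


def nme(pred, gt, norm_dist):
    """Normalized mean error for one face.

    pred, gt  : sequences of (x, y) landmark coordinates, same length.
    norm_dist : normalizing distance (assumption: chosen by the benchmark,
                e.g. inter-ocular distance; not specified on this page).
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and the same length")
    if norm_dist <= 0:
        raise ValueError("norm_dist must be positive")
    # Euclidean distance between each predicted and ground-truth landmark.
    errors = [math.hypot(px - gx, py - gy)
              for (px, py), (gx, gy) in zip(pred, gt)]
    # Average over landmarks, then normalize.
    return sum(errors) / len(errors) / norm_dist


# Toy data (made up for illustration): three landmarks, each 5 pixels off.
gt = [(0.0, 0.0), (100.0, 0.0), (50.0, 50.0)]
pred = [(3.0, 4.0), (100.0, 5.0), (45.0, 50.0)]

value = nme(pred, gt, norm_dist=100.0)
print(f"NME as fraction: {value:.4f}")
print(f"NME as percent:  {value * 100:.2f}")
```

Under this definition the fractional and percentage forms differ only by a factor of 100, which would explain why the same result appears as both 0.0543 and 5.43 in the table.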
