Learning Deep Representation for Face Alignment with Auxiliary Attributes
About
In this study, we show that landmark detection, or face alignment, is not a single, independent problem. Instead, its robustness can be greatly improved with auxiliary information. Specifically, we jointly optimize landmark detection together with the recognition of heterogeneous but subtly correlated facial attributes, such as gender, expression, and appearance attributes. This is non-trivial, since different attribute inference tasks differ in learning difficulty and convergence rate. To address this problem, we formulate a novel tasks-constrained deep model, which not only learns the inter-task correlation but also employs dynamic task coefficients to facilitate optimization convergence when learning multiple complex tasks. Extensive evaluations show that the proposed task-constrained learning (i) outperforms existing face alignment methods, especially on faces with severe occlusion and pose variation, and (ii) drastically reduces model complexity compared to state-of-the-art methods based on cascaded deep models.
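As a rough illustration of the joint objective described above, the sketch below adds coefficient-weighted auxiliary attribute losses to a landmark regression loss. The function names, the dictionary layout, and the choice of least-squares and softmax cross-entropy losses are assumptions made for this example. The abstract does not say how the dynamic task coefficients are updated, so they are passed in as plain numbers here.

```python
import math

def squared_error(pred, target):
    """Half the sum of squared differences between predicted and true coordinates."""
    return 0.5 * sum((p - t) ** 2 for p, t in zip(pred, target))

def cross_entropy(logits, label):
    """Softmax cross-entropy for one auxiliary attribute (e.g. gender)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]

def joint_loss(landmark_pred, landmark_true, aux_logits, aux_labels, coeffs):
    """Main landmark loss plus coefficient-weighted auxiliary attribute losses.

    coeffs maps each auxiliary task name to its (hypothetical) task coefficient.
    """
    loss = squared_error(landmark_pred, landmark_true)
    for name, logits in aux_logits.items():
        loss += coeffs[name] * cross_entropy(logits, aux_labels[name])
    return loss

# Usage: one flattened landmark vector and a single auxiliary task.
example = joint_loss(
    landmark_pred=[0.0, 0.0],
    landmark_true=[1.0, 1.0],
    aux_logits={"gender": [0.0, 0.0]},
    aux_labels={"gender": 0},
    coeffs={"gender": 0.5},
)
```

Setting a coefficient to zero removes that auxiliary task from the objective. A training loop could change the coefficients over time to balance tasks that converge at different rates.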
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Facial Landmark Detection | 300-W (Common) | NME | 0.048 | 180 |
| Facial Landmark Detection | 300-W (Fullset) | Mean Error (%) | 5.54 | 174 |
| Face Alignment | 300W (Challenging) | NME | 8.6 | 93 |
| Face Alignment | 300W Common | NME | 4.8 | 90 |
| Face Alignment | 300W Fullset (test) | NME | 5.54 | 82 |
| Face Alignment | COFW (test) | NME | 8.05 | 72 |
| Face Alignment | 300-W (Full) | NME | 5.54 | 66 |
| Landmark Localization | AFLW (test) | NME (%) | 7.65 | 54 |
| Facial Landmark Detection | 300W | -- | -- | 52 |
| Facial Landmark Detection | 300-W Challenging Subset | Mean Error | 8.6 | 49 |
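
The NME and mean-error figures in the table above are normalized landmark errors. A minimal sketch of how such a score is commonly computed follows, assuming 2D landmarks and a caller-supplied normalizing distance such as an inter-ocular distance. The function name `nme` and the example coordinates are hypothetical. The exact normalization used by each benchmark is not stated in this summary.

```python
import math

def nme(pred, truth, norm_distance):
    """Mean point-to-point Euclidean error divided by a normalizing distance.

    pred, truth: sequences of (x, y) landmark coordinates in the same order.
    norm_distance: assumed normalizer, e.g. the distance between the eyes.
    """
    errors = [math.hypot(p[0] - t[0], p[1] - t[1]) for p, t in zip(pred, truth)]
    return sum(errors) / (len(errors) * norm_distance)

# Usage: two landmarks, one off by 5 pixels, with a 50-pixel normalizer.
score = nme(pred=[(0, 0), (10, 0)], truth=[(3, 4), (10, 0)], norm_distance=50)
percent = 100 * score  # the same error expressed as a percentage
```

The same error can be reported as a fraction or as a percentage, which may explain why the 300-W Common rows list 0.048 in one place and 4.8 in another.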