FaceNet2ExpNet: Regularizing a Deep Face Recognition Net for Expression Recognition
About
Relatively small data sets available for expression recognition research make the training of deep networks for expression recognition very challenging. Although fine-tuning can partially alleviate the issue, the performance is still below acceptable levels, as the deep features probably contain redundant information from the pre-trained domain. In this paper, we present FaceNet2ExpNet, a novel idea to train an expression recognition network based on static images. We first propose a new distribution function to model the high-level neurons of the expression network. Based on this, a two-stage training algorithm is carefully designed. In the pre-training stage, we train the convolutional layers of the expression net, regularized by the face net; in the refining stage, we append fully-connected layers to the pre-trained convolutional layers and train the whole network jointly. Visualization shows that the model trained with our method captures improved high-level expression semantics. Evaluations on four public expression databases (CK+, Oulu-CASIA, TFD, and SFEW) demonstrate that our method achieves better results than the state of the art.
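The two-stage scheme described above can be illustrated with a minimal sketch of the two objectives. The abstract does not state the form of the regularizer, so this sketch assumes a mean-squared feature-matching loss between the expression net's convolutional features and those of a frozen face net for the pre-training stage, and a standard softmax cross-entropy for the refining stage. The function names and the toy feature vectors are hypothetical and are not taken from the paper.

```python
import math


def feature_matching_loss(expr_features, face_features):
    """Stage 1 (assumed form): mean squared difference between the expression
    net's conv features and the frozen face net's features for the same image."""
    if len(expr_features) != len(face_features):
        raise ValueError("feature vectors must have the same length")
    squared = [(e - f) ** 2 for e, f in zip(expr_features, face_features)]
    return sum(squared) / len(squared)


def softmax_cross_entropy(logits, label):
    """Stage 2: cross-entropy of softmax(logits) against an integer class label,
    computed on the output of the appended fully-connected layers."""
    shift = max(logits)  # subtract the max for numerical stability
    log_sum_exp = shift + math.log(sum(math.exp(z - shift) for z in logits))
    return log_sum_exp - logits[label]


# Toy values standing in for real network outputs (hypothetical).
pretrain_loss = feature_matching_loss([1.0, 2.0], [1.0, 4.0])
refine_loss = softmax_cross_entropy([0.0, 0.0], 0)
```

In a real training loop, the first loss would update only the convolutional layers of the expression net while the face net stays fixed, and the second would update the whole network once the fully-connected layers are attached.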
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Facial Expression Recognition | CK+ | Accuracy | 98.6 | 72 |
| Facial Expression Recognition | Oulu-CASIA | Accuracy | 87.71 | 17 |
| Facial Expression Recognition | CK+ Six Classes (10-fold val) | Accuracy | 98.6 | 11 |
| Facial Expression Recognition | CK+ Eight Classes (10-fold cross val) | Avg Accuracy | 96.8 | 11 |
| Expression Recognition | SFEW (val) | Average Accuracy | 55.15 | 10 |
| Facial Expression Recognition | Oulu-CASIA Strong illumination VIS (test) | Accuracy | 87.71 | 10 |
| Facial Expression Recognition | TFD five folds (test) | Average Accuracy | 88.9 | 8 |