Neural Aggregation Network for Video Face Recognition
About
This paper presents a Neural Aggregation Network (NAN) for video face recognition. The network takes a face video or face image set of a person with a variable number of face images as its input, and produces a compact, fixed-dimension feature representation for recognition. The whole network is composed of two modules. The feature embedding module is a deep Convolutional Neural Network (CNN) which maps each face image to a feature vector. The aggregation module consists of two attention blocks which adaptively aggregate the feature vectors to form a single feature inside the convex hull spanned by them. Due to the attention mechanism, the aggregation is invariant to the image order. Our NAN is trained with a standard classification or verification loss without any extra supervision signal, and we found that it automatically learns to favor high-quality face images while down-weighting low-quality ones such as blurred, occluded, and improperly exposed faces. Experiments on the IJB-A, YouTube Face, and Celebrity-1000 video face recognition benchmarks show that it consistently outperforms naive aggregation methods and achieves state-of-the-art accuracy.
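The aggregation step described above can be illustrated with a minimal sketch. It assumes dot-product scoring followed by a softmax, and it assumes the second attention block derives its query from the first block's output through a tanh layer; the abstract itself only states that there are two attention blocks. The names `q0`, `W`, and `b`, and the toy inputs in the usage note, are hypothetical, and the CNN embedding module is omitted (the features are taken as given).

```python
import math

def softmax(scores):
    """Numerically stable softmax; outputs are positive and sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_block(features, query):
    """Score each feature by its dot product with `query`, then return the
    softmax-weighted sum of the features together with the weights."""
    scores = [sum(q * x for q, x in zip(query, f)) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, features))
              for d in range(dim)]
    return pooled, weights

def aggregate(features, q0, W, b):
    """Two cascaded attention blocks (q0, W, b are hypothetical parameters)."""
    # First pass: a fixed query gives a coarse summary of the set.
    r0, _ = attention_block(features, q0)
    # Assumed design: the second query adapts to the set via a tanh layer.
    q1 = [math.tanh(sum(w * r for w, r in zip(row, r0)) + bias)
          for row, bias in zip(W, b)]
    # Second pass: re-weight the same features with the adapted query.
    return attention_block(features, q1)
```

For example, `aggregate([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]], [0.3, -0.2], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])` returns one 2-dimensional vector and three weights. Because the weights are positive and sum to one, the result lies inside the convex hull of the inputs, and because reordering the inputs only permutes the terms of each sum, the result does not depend on image order (up to floating-point rounding).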
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Face Verification | IJB-A | TAR @ FAR=1% | 0.941 | 38 |
| Face Verification | IJB-A (test) | TAR @ FAR=0.01 | 0.941 | 37 |
| Face Identification | IJB-A (test) | Rank-1 | 95.8 | 30 |
| Face Verification | YouTube Face (YTF) 40 (10-fold cross-validation) | Accuracy | 95.72 | 23 |
| Face Identification | IJB-A 1:N Identification | Rank-1 | 95.8 | 19 |
| Face Recognition | IJB-A (test) | TAR @ FAR=0.01 | 94.1 | 16 |
| Video-wise Identification | DroneSURF Active Surveillance 1.0 (test) | Rank-1 Acc | 80.21 | 14 |
| Video-wise Identification | DroneSURF Passive Surveillance 1.0 (test) | Rank-1 Accuracy | 0.0833 | 14 |
| Face Verification | BTS Face Included 3.1 (Treatment) | TAR @ FAR=1e-1 | 65.29 | 9 |
| Face Verification | BTS Face Included 3.1 (Control) | TAR @ FAR=1e-1 | 96.06 | 9 |