Exploiting temporal and depth information for multi-frame face anti-spoofing
About
Face anti-spoofing is significant to the security of face recognition systems. Previous work on depth-supervised learning has proved effective for face anti-spoofing. Nevertheless, these methods only considered depth as an auxiliary supervision signal within a single frame. In contrast, we develop a new method to estimate depth information from multiple RGB frames and propose a depth-supervised architecture that can efficiently encode spatiotemporal information for presentation attack detection. It includes two novel modules: an optical flow guided feature block (OFFB) and a convolutional gated recurrent unit (ConvGRU) module, which are designed to extract short-term and long-term motion, respectively, to discriminate between live and spoofed faces. Extensive experiments demonstrate that the proposed approach achieves state-of-the-art results on four benchmark datasets, namely OULU-NPU, SiW, CASIA-MFSD, and Replay-Attack.
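To make the recurrent part of the description concrete, here is a minimal sketch of one generic ConvGRU update in pure Python. It is not the authors' module: it assumes a single channel, 3x3 kernels, zero padding, no bias terms, and one common gating convention (conventions differ on whether the update gate weights the old state or the candidate). The function names and the kernel dictionary keys (`wz`, `uz`, `wr`, `ur`, `wh`, `uh`) are hypothetical and chosen only for illustration.

```python
import math


def conv2d_same(img, kernel):
    """Single-channel 2D cross-correlation with zero padding (same output size)."""
    h, w = len(img), len(img[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    y, x = i + a - ph, j + b - pw
                    if 0 <= y < h and 0 <= x < w:
                        acc += img[y][x] * kernel[a][b]
            out[i][j] = acc
    return out


def _sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def _zip_map(fn, *grids):
    """Apply fn elementwise across equally sized 2D grids."""
    return [[fn(*vals) for vals in zip(*rows)] for rows in zip(*grids)]


def conv_gru_step(x, h, k):
    """One ConvGRU update; k maps hypothetical names to 2D kernels."""
    # Update gate z and reset gate r, both computed with convolutions.
    z = _zip_map(lambda a, b: _sigmoid(a + b),
                 conv2d_same(x, k["wz"]), conv2d_same(h, k["uz"]))
    r = _zip_map(lambda a, b: _sigmoid(a + b),
                 conv2d_same(x, k["wr"]), conv2d_same(h, k["ur"]))
    # Candidate state uses the reset-gated previous state.
    rh = _zip_map(lambda a, b: a * b, r, h)
    cand = _zip_map(lambda a, b: math.tanh(a + b),
                    conv2d_same(x, k["wh"]), conv2d_same(rh, k["uh"]))
    # Blend previous state and candidate with the update gate.
    return _zip_map(lambda zi, hi, ci: (1.0 - zi) * hi + zi * ci, z, h, cand)


# Usage: run the cell over a short sequence of toy 4x4 "frames".
identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
kernels = {name: identity for name in ("wz", "uz", "wr", "ur", "wh", "uh")}
state = [[0.0] * 4 for _ in range(4)]
for t in range(3):
    frame = [[0.1 * t] * 4 for _ in range(4)]
    state = conv_gru_step(frame, state, kernels)
```

Because the hidden state is carried across frames and updated through convolutions, it keeps its spatial layout while accumulating information over the sequence, which is the property the abstract relies on for long-term motion.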
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Face Anti-Spoofing | OULU-NPU (Protocol 1) | ACER (%) | 1.3 | 24 |
| Face Anti-Spoofing | CASIA-MFSD RC Protocol (Train on Replay-Attack) (test) | HTER (%) | 24 | 11 |
| Face Anti-Spoofing | Replay-Attack CR Protocol (Train on CASIA-MFSD) (test) | HTER | 17.5 | 11 |
| Face Anti-Spoofing | OULU-NPU (Protocol 2) | APCER (%) | 1.7 | 8 |
| Face Anti-Spoofing | SiW Protocol 1 (test) | ACER | 0.73 | 5 |
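For readers unfamiliar with the metrics in the table, the sketch below shows how APCER, BPCER, and ACER are commonly computed from binary decisions. It is a simplified illustration, not the evaluation code behind the reported numbers: it assumes labels and predictions use 1 for a bona fide (live) face and 0 for an attack, and it pools all attacks together, whereas the standard definition takes the worst case over attack types. The function name `pad_metrics` and the toy data are hypothetical. HTER is computed analogously, as the mean of the false acceptance rate and the false rejection rate.

```python
def pad_metrics(labels, preds):
    """Return APCER, BPCER and ACER as fractions (1 = live, 0 = attack)."""
    attack_preds = [p for l, p in zip(labels, preds) if l == 0]
    live_preds = [p for l, p in zip(labels, preds) if l == 1]
    # APCER: share of attack samples wrongly accepted as live.
    apcer = sum(1 for p in attack_preds if p == 1) / len(attack_preds)
    # BPCER: share of live samples wrongly rejected as attacks.
    bpcer = sum(1 for p in live_preds if p == 0) / len(live_preds)
    # ACER: mean of the two error rates.
    return {"APCER": apcer, "BPCER": bpcer, "ACER": (apcer + bpcer) / 2}


# Toy example: 4 live samples (1 rejected), 8 attacks (1 accepted).
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
preds = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = pad_metrics(labels, preds)
```

Multiplying each value by 100 gives the percentage form used in the table.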