
Gaze Estimation using Transformer

About

Recent work has proven the effectiveness of transformers in many computer vision tasks. However, the performance of transformers in gaze estimation is still unexplored. In this paper, we employ transformers and assess their effectiveness for gaze estimation. We consider two forms of vision transformer: pure transformers and hybrid transformers. We first follow the popular ViT and employ a pure transformer to estimate gaze from images. On the other hand, we preserve the convolutional layers and integrate CNNs as well as transformers, where the transformer serves as a component to complement CNNs. We compare the performance of the two transformers in gaze estimation. The hybrid transformer significantly outperforms the pure transformer on all evaluation datasets with fewer parameters. We further conduct experiments to assess the effectiveness of the hybrid transformer and explore the advantage of the self-attention mechanism. Experiments show the hybrid transformer can achieve state-of-the-art performance on all benchmarks with pre-training. To facilitate further research, we release code and models at https://github.com/yihuacheng/GazeTR.
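The hybrid design described above can be illustrated with a minimal PyTorch sketch: a small CNN backbone extracts a feature map, the spatial locations are flattened into tokens, and a transformer encoder with an extra learnable token produces the gaze prediction. All layer sizes and names here are illustrative assumptions, not the authors' released implementation (see the GitHub link for that).

```python
# Illustrative sketch of a hybrid CNN + Transformer gaze estimator.
# Hyperparameters and module names are assumptions, not the GazeTR code.
import torch
import torch.nn as nn

class HybridGazeSketch(nn.Module):
    def __init__(self, d_model=32, nhead=4, num_layers=2):
        super().__init__()
        # Tiny CNN backbone: face image -> 7x7 feature map.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, d_model, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(7),
        )
        # Learnable token that aggregates gaze information, plus
        # positional embeddings for the 49 feature tokens + 1 extra token.
        self.cls_token = nn.Parameter(torch.zeros(1, 1, d_model))
        self.pos_embed = nn.Parameter(torch.zeros(1, 7 * 7 + 1, d_model))
        layer = nn.TransformerEncoderLayer(
            d_model, nhead, dim_feedforward=2 * d_model, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, 2)  # 2D gaze direction (yaw, pitch)

    def forward(self, x):
        f = self.backbone(x)                      # (B, d_model, 7, 7)
        tokens = f.flatten(2).transpose(1, 2)     # (B, 49, d_model)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        tokens = torch.cat([cls, tokens], dim=1) + self.pos_embed
        out = self.encoder(tokens)                # self-attention over tokens
        return self.head(out[:, 0])               # predict gaze from the extra token

model = HybridGazeSketch()
gaze = model(torch.randn(2, 3, 224, 224))  # gaze.shape == (2, 2)
```

The CNN supplies local spatial features cheaply, while self-attention mixes information across the whole face, which is the complementarity the paper's comparison attributes the hybrid model's advantage to.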

Yihua Cheng, Feng Lu · 2021

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Gaze Estimation | Gaze360 (test) | MAE (All 360°) | 15.29 | 52 |
| Gaze Estimation | GFIE trained on Gaze360 (Backward) | Angular Error (degrees) | 42.84 | 24 |
| 3D Gaze Estimation | GFIE (test) | MAE 3D | 21.12 | 23 |
| Gaze Estimation | Gaze360 | Angular Error | 10.62 | 16 |
| Gaze Estimation | MPIIFaceGaze | Angular Error (degrees) | 4 | 15 |
| Gaze Estimation | MPIIFaceGaze M (test) | Gaze Error (degrees) | 4.18 | 15 |
| Gaze Estimation | EYEDIAP (E) (test) | Mean Gaze Error (degrees) | 5.44 | 15 |
| Gaze Estimation | EYEDIAP | Angular Error | 5.17 | 12 |
| Gaze Estimation | GFIE trained on Gaze360 (Front) | Angular Error (degrees) | 30.83 | 12 |
| Gaze Estimation | GFIE Front facing trained on Gaze360 | Angular Error | 35.58 | 12 |

Showing 10 of 19 rows.
