
UniGaze: Towards Universal Gaze Estimation via Large-scale Pre-Training

About

Despite decades of research on data collection and model architectures, current gaze estimation models encounter significant challenges in generalizing across diverse data domains. Recent advances in self-supervised pre-training have shown remarkable generalization performance across various vision tasks. However, their effectiveness in gaze estimation remains unexplored. We propose UniGaze, which for the first time leverages large-scale in-the-wild facial datasets for gaze estimation through self-supervised pre-training. Through systematic investigation, we clarify the critical factors for effective pre-training in gaze estimation. Our experiments reveal that self-supervised approaches designed for semantic tasks fail when applied to gaze estimation, while our carefully designed pre-training pipeline consistently improves cross-domain performance. Through comprehensive experiments on challenging cross-dataset evaluation and novel protocols, including leave-one-dataset-out and joint-dataset settings, we demonstrate that UniGaze significantly improves generalization across multiple data domains while minimizing reliance on costly labeled data. Source code and model are available at https://github.com/ut-vision/UniGaze.

Jiawei Qin, Xucong Zhang, Yusuke Sugano • 2025

Related benchmarks

Task             Dataset                         Metric                       Result  Rank
Gaze Estimation  MPIIFaceGaze DM                 Euclidean Distance (||d||2)  52.8    25
Gaze Estimation  MPIIFaceGaze DM (test)          Angular Error (d)            5.08    12
Gaze Estimation  EyeDiap cross-domain DE (test)  Angular Error d              6.11    7
Gaze Estimation  RealGaze Overall                dX Error                     18.3    6
Gaze Estimation  RealGaze Ideal                  dX Error (pixels)            13.9    6
Gaze Estimation  RealGaze Side-Lit               dX Error                     14.6    6
Gaze Estimation  RealGaze Masks                  dX Error                     16      6
Gaze Estimation  RealGaze Glasses                dX Error                     25.1    6
Gaze Estimation  EyeDiap DE (test)               Angular Error                4.53    5
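
The angular-error rows above presumably use the metric common in gaze estimation: the angle, in degrees, between the predicted and ground-truth 3D gaze direction vectors. The following is a minimal sketch under that assumption. The function names are illustrative and are not taken from the UniGaze repository, and the page does not state how each benchmark actually computes its score.

```python
import math

def angular_error_deg(pred, gt):
    """Angle in degrees between two 3D gaze direction vectors.

    Assumption: both inputs are non-zero (x, y, z) vectors; they do not
    need to be unit length because the cosine is normalized here.
    """
    dot = sum(p * g for p, g in zip(pred, gt))
    norm_pred = math.sqrt(sum(p * p for p in pred))
    norm_gt = math.sqrt(sum(g * g for g in gt))
    cos_angle = dot / (norm_pred * norm_gt)
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))

def mean_angular_error_deg(preds, gts):
    """Average angular error over paired predictions and ground truths."""
    errors = [angular_error_deg(p, g) for p, g in zip(preds, gts)]
    return sum(errors) / len(errors)
```

A per-dataset score of this kind would be obtained by calling `mean_angular_error_deg` on all test-set predictions and their labels. Lower values are better.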