UniGaze: Towards Universal Gaze Estimation via Large-scale Pre-Training

About

Despite decades of research on data collection and model architectures, current gaze estimation models encounter significant challenges in generalizing across diverse data domains. Recent advances in self-supervised pre-training have shown remarkable generalization performance across various vision tasks. However, their effectiveness in gaze estimation remains unexplored. We propose UniGaze, which for the first time leverages large-scale in-the-wild facial datasets for gaze estimation through self-supervised pre-training. Through systematic investigation, we clarify the factors that are essential for effective pre-training in gaze estimation. Our experiments reveal that self-supervised approaches designed for semantic tasks fail when applied to gaze estimation, while our carefully designed pre-training pipeline consistently improves cross-domain performance. Through comprehensive experiments on challenging cross-dataset evaluation and novel protocols, including leave-one-dataset-out and joint-dataset settings, we demonstrate that UniGaze significantly improves generalization across multiple data domains while minimizing reliance on costly labeled data. The source code and model are available at https://github.com/ut-vision/UniGaze.

Jiawei Qin, Xucong Zhang, Yusuke Sugano • 2025

Related benchmarks

Task	Dataset	Metric	Result	
Gaze Estimation	EyeDiap DE (test)	Angular Error	4.53	45
Gaze Estimation	MPIIFaceGaze DM	Euclidean Distance (||d||2)	52.8	25
Gaze Estimation	MPIIFaceGaze DM (test)	Angular Error (d)	5.08	12
Gaze Estimation	EyeDiap cross-domain DE (test)	Angular Error d	6.11	7
Gaze Estimation	RealGaze Overall	dX Error	18.3	6
Gaze Estimation	RealGaze Ideal	dX Error (pixels)	13.9	6
Gaze Estimation	RealGaze Side-Lit	dX Error	14.6	6
Gaze Estimation	RealGaze Masks	dX Error	16	6
Gaze Estimation	RealGaze Glasses	dX Error	25.1	6
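
The angular error metric listed above is commonly computed in gaze estimation as the angle between the predicted and ground-truth 3D gaze directions. The following is a minimal sketch of that computation, assuming both directions are given as 3D vectors and the result is reported in degrees. The function name `angular_error_deg` is hypothetical and is not taken from the UniGaze repository, and the exact evaluation protocol behind each benchmark row may differ.

```python
import math


def angular_error_deg(pred, true):
    """Angle in degrees between two 3D gaze direction vectors.

    Hypothetical helper for illustration; inputs need not be unit length.
    """
    dot = sum(p * t for p, t in zip(pred, true))
    norm_pred = math.sqrt(sum(p * p for p in pred))
    norm_true = math.sqrt(sum(t * t for t in true))
    if norm_pred == 0.0 or norm_true == 0.0:
        raise ValueError("gaze vectors must be non-zero")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_pred * norm_true)))
    return math.degrees(math.acos(cosine))


# Usage: two orthogonal directions are 90 degrees apart.
error = angular_error_deg((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```

A dataset-level score would typically be the mean of this per-sample error over the test set, which is an assumption about the reporting convention rather than something stated on this page.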

