UniGaze: Towards Universal Gaze Estimation via Large-scale Pre-Training
About
Despite decades of research on data collection and model architectures, current gaze estimation models still struggle to generalize across diverse data domains. Recent advances in self-supervised pre-training have shown remarkable generalization across various vision tasks, but their effectiveness in gaze estimation remains unexplored. We propose UniGaze, which for the first time leverages large-scale in-the-wild facial datasets for gaze estimation through self-supervised pre-training. Through systematic investigation, we identify the factors that are critical for effective pre-training in gaze estimation. Our experiments reveal that self-supervised approaches designed for semantic tasks fail when applied to gaze estimation, while our carefully designed pre-training pipeline consistently improves cross-domain performance. Through comprehensive experiments on challenging cross-dataset evaluation and novel protocols, including leave-one-dataset-out and joint-dataset settings, we demonstrate that UniGaze significantly improves generalization across multiple data domains while minimizing reliance on costly labeled data. Source code and model are available at https://github.com/ut-vision/UniGaze.
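
As a rough illustration of the leave-one-dataset-out protocol named in the abstract, the sketch below enumerates the train/test splits such a protocol implies: each dataset is held out once for testing while the remaining ones are used for training. The function name `leave_one_dataset_out` and the placeholder dataset names are assumptions made for this example; they are not taken from the UniGaze code base, and the paper's actual dataset list may differ.

```python
from typing import Iterator, List, Tuple


def leave_one_dataset_out(datasets: List[str]) -> Iterator[Tuple[List[str], str]]:
    """Yield (training datasets, held-out test dataset) pairs.

    Each dataset is held out exactly once; all others form the training set.
    """
    for held_out in datasets:
        train = [name for name in datasets if name != held_out]
        yield train, held_out


# Hypothetical dataset names, used only to show the shape of the splits.
for train_sets, test_set in leave_one_dataset_out(["dataset_a", "dataset_b", "dataset_c"]):
    print(f"train on {train_sets}, test on {test_set}")
```

With N datasets this produces N splits, so every domain is evaluated as an unseen target exactly once.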
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Gaze Estimation | MPIIFaceGaze DM | Euclidean Distance (‖d‖₂) | 52.8 | 25 |
| Gaze Estimation | MPIIFaceGaze DM (test) | Angular Error (°) | 5.08 | 12 |
| Gaze Estimation | EyeDiap cross-domain DE (test) | Angular Error (°) | 6.11 | 7 |
| Gaze Estimation | RealGaze Overall | dX Error | 18.3 | 6 |
| Gaze Estimation | RealGaze Ideal | dX Error (pixels) | 13.9 | 6 |
| Gaze Estimation | RealGaze Side-Lit | dX Error | 14.6 | 6 |
| Gaze Estimation | RealGaze Masks | dX Error | 16 | 6 |
| Gaze Estimation | RealGaze Glasses | dX Error | 25.1 | 6 |
| Gaze Estimation | EyeDiap DE (test) | Angular Error | 4.53 | 5 |
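
The table does not say how Angular Error is computed. A common definition in gaze estimation is the angle between the predicted and ground-truth 3D gaze direction vectors, and the sketch below assumes that definition. The function name `angular_error_deg` is an assumption for this example and is not taken from the UniGaze repository or from any benchmark's evaluation script.

```python
import math
from typing import Sequence


def angular_error_deg(pred: Sequence[float], target: Sequence[float]) -> float:
    """Angle in degrees between two 3D gaze direction vectors.

    Assumes both vectors are non-zero; they do not need to be unit length.
    """
    dot = sum(p * t for p, t in zip(pred, target))
    norm_pred = math.sqrt(sum(p * p for p in pred))
    norm_target = math.sqrt(sum(t * t for t in target))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, dot / (norm_pred * norm_target)))
    return math.degrees(math.acos(cos_angle))


# Perpendicular directions are 90 degrees apart; identical directions are 0.
print(angular_error_deg([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))
print(angular_error_deg([0.0, 0.0, -1.0], [0.0, 0.0, -1.0]))
```

A dataset-level score under this definition would be the mean of the per-sample angles over the test set.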