
Beyond Scanpaths: Graph-Based Gaze Simulation in Dynamic Scenes

About

Accurately modelling human attention is essential for numerous computer vision applications, particularly in the domain of automotive safety. Existing methods typically collapse gaze into saliency maps or scanpaths, treating gaze dynamics only implicitly. We instead formulate gaze modelling as an autoregressive dynamical system and explicitly unroll raw gaze trajectories over time, conditioned on both gaze history and the evolving environment. Driving scenes are represented as gaze-centric graphs processed by the Affinity Relation Transformer (ART), a heterogeneous graph transformer that models interactions between driver gaze, traffic objects, and road structure. We further introduce the Object Density Network (ODN) to predict next-step gaze distributions, capturing the stochastic and object-centric nature of attentional shifts in complex environments. We also release Focus100, a new dataset of raw gaze data from 30 participants viewing egocentric driving footage. Trained directly on raw gaze, without fixation filtering, our unified approach produces more natural gaze trajectories, scanpath dynamics, and saliency maps than existing attention models, offering valuable insights for the temporal modelling of human attention in dynamic environments.
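The autoregressive formulation described above can be illustrated with a small sketch: at each step a distribution over the next gaze point is produced from the gaze history and the current scene, one sample is drawn, and that sample is fed back as history. This is not the authors' ART or ODN implementation. The function `next_gaze_distribution` is a hypothetical hand-written stand-in for a learned model, and the object-centred Gaussian mixture, the distance-based weights, and the normalised [0, 1] image coordinates are all assumptions made for illustration.

```python
import numpy as np

def next_gaze_distribution(gaze_history, objects):
    """Hypothetical stand-in for a learned next-step density model.

    Returns a Gaussian mixture with one component per object:
    weights (K,), means (K, 2), and isotropic std devs (K,).
    Objects closer to the most recent gaze point get more weight.
    """
    last = gaze_history[-1]
    dists = np.linalg.norm(objects - last, axis=1)
    logits = -dists / 0.25                      # assumed temperature
    weights = np.exp(logits - logits.max())     # numerically stable softmax
    weights /= weights.sum()
    sigmas = np.full(len(objects), 0.02)        # assumed fixed spread
    return weights, objects, sigmas

def rollout(initial_gaze, object_tracks, rng):
    """Unroll a gaze trajectory autoregressively.

    object_tracks has shape (T, K, 2): K object centres per frame.
    Returns an array of shape (T + 1, 2) including the initial point.
    """
    trajectory = [np.asarray(initial_gaze, dtype=float)]
    for objects in object_tracks:
        weights, means, sigmas = next_gaze_distribution(
            np.stack(trajectory), objects
        )
        k = rng.choice(len(weights), p=weights)  # pick a mixture component
        sample = means[k] + sigmas[k] * rng.standard_normal(2)
        trajectory.append(np.clip(sample, 0.0, 1.0))  # stay inside the frame
    return np.stack(trajectory)

rng = np.random.default_rng(0)
tracks = rng.random((20, 3, 2))                  # 20 frames, 3 synthetic objects
gaze = rollout([0.5, 0.5], tracks, rng)
```

Sampling from the predicted distribution, rather than taking its mode, is what lets repeated rollouts over the same footage produce different but plausible trajectories.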

Luke Palmer, Petar Palasek, Hazem Abdelkawy • 2026

Related benchmarks

Task                                   Dataset   Result     Rank
Gaze Sequence and Dynamics Prediction  MAAD      TC 0.46    9
Gaze Sequence and Dynamics Prediction  Focus100  TC 0.22    9
Gaze Saliency Map Estimation           Focus100  NSS 4.864  8
Gaze Saliency Map Estimation           MAAD      NSS 4.926  8
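Assuming NSS in the saliency rows denotes Normalized Scanpath Saliency, the standard metric of that name, it can be computed as the mean of the z-scored saliency map at the ground-truth gaze locations. The sketch below is a minimal version under that assumption. The function name `nss` and the (row, column) pixel convention for fixations are illustrative choices, not taken from the paper or the benchmark.

```python
import numpy as np

def nss(saliency_map, fixations):
    """Normalized Scanpath Saliency.

    saliency_map: 2D array of predicted saliency values.
    fixations: iterable of (row, col) integer pixel coordinates.
    Returns the mean z-scored saliency at the fixated pixels.
    """
    s = np.asarray(saliency_map, dtype=float)
    std = s.std()
    if std == 0:
        raise ValueError("saliency map is constant; NSS is undefined")
    z = (s - s.mean()) / std                     # zero mean, unit variance
    rows, cols = np.asarray(fixations, dtype=int).T
    return float(z[rows, cols].mean())

# A single-peak map scored at the peak gives a high NSS.
pred = np.zeros((4, 4))
pred[1, 2] = 1.0
score = nss(pred, [(1, 2)])
```

A score of 0 corresponds to chance level, and higher values mean the predicted saliency is concentrated where gaze actually landed.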
