A Lightweight Spatiotemporal Network for Online Eye Tracking with Event Camera

About

Event-based data are commonly encountered in edge computing environments where efficiency and low latency are critical. To interface with such data and leverage their rich temporal features, we propose a causal spatiotemporal convolutional network. This solution targets efficient implementation on edge-appropriate hardware with limited resources in three ways: 1) it deliberately uses a simple architecture and set of operations (convolutions, ReLU activations); 2) it can be configured to perform online inference efficiently via buffering of layer outputs; 3) it can achieve more than 90% activation sparsity through regularization during training, enabling very significant efficiency gains on event-based processors. In addition, we propose a general affine augmentation strategy acting directly on the events, which alleviates the problem of dataset scarcity for event-based systems. We apply our model to the AIS 2024 event-based eye tracking challenge, reaching a p10 accuracy of 0.9916 on the Kaggle private test set.
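
The idea of online inference via buffering of layer outputs can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single depthwise causal temporal convolution over per-frame feature vectors, and the names `causal_conv_offline` and `StreamingCausalConv` are hypothetical. It shows that keeping a FIFO buffer of the last K inputs lets the layer process one frame at a time while matching the offline (whole-sequence) result.

```python
import numpy as np


def causal_conv_offline(x, kernel):
    """Causal depthwise temporal convolution over a full sequence.

    x:      (T, C) array of per-frame features.
    kernel: (K, C) array, one temporal filter per channel (an assumption).
    The output at time t depends only on frames t-K+1 .. t (zero-padded on the left).
    """
    K, C = kernel.shape
    padded = np.concatenate([np.zeros((K - 1, C)), x], axis=0)
    return np.stack([(padded[t:t + K] * kernel).sum(axis=0)
                     for t in range(x.shape[0])])


class StreamingCausalConv:
    """Same layer, evaluated one frame at a time using a FIFO buffer."""

    def __init__(self, kernel):
        self.kernel = kernel
        # Buffer holds the K most recent input frames, oldest first.
        self.buffer = np.zeros_like(kernel, dtype=float)

    def step(self, frame):
        # Drop the oldest frame, append the newest one.
        self.buffer = np.roll(self.buffer, -1, axis=0)
        self.buffer[-1] = frame
        return (self.buffer * self.kernel).sum(axis=0)


# Compare streaming and offline evaluation on random data.
rng = np.random.default_rng(0)
x = rng.normal(size=(20, 3))       # 20 frames, 3 channels
kernel = rng.normal(size=(4, 3))   # temporal kernel of length 4

offline = causal_conv_offline(x, kernel)
layer = StreamingCausalConv(kernel)
online = np.stack([layer.step(frame) for frame in x])
```

In this sketch each new frame costs one multiply-accumulate over the buffer, so latency per frame is constant and no past frames need to be recomputed.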

Yan Ru Pei, Sasskia Brüers, Sébastien Crouzet, Douglas McLelland, Olivier Coenen • 2024

Related benchmarks

Task	Dataset	Result	Rank
Eye Tracking	3ET+ CVPR AIS Challenge 2024	P10 Error 99	20
Hand Gesture Recognition	DVS128 10-class (test)	Accuracy 99.17	11

