Event-Based Visual Teach-and-Repeat via Fast Fourier-Domain Cross-Correlation
## About
Visual teach-and-repeat (VT&R) navigation enables robots to autonomously retrace previously demonstrated paths using visual feedback. We present a novel event-camera-based VT&R system. Our system formulates event-stream matching as frequency-domain cross-correlation, transforming spatial convolutions into efficient Fourier-space multiplications. By exploiting the binary structure of event frames and applying image compression techniques, we achieve a processing latency of just 2.88 ms, about 3.5 times faster than conventional camera-based baselines that are optimised for runtime efficiency. Experiments with a Prophesee EVK4 HD event camera mounted on an AgileX Scout Mini robot demonstrate successful autonomous navigation over more than 3000 m of indoor and outdoor trajectories in both daytime and nighttime conditions. Our system maintains Cross-Track Errors (XTE) below 15 cm, demonstrating the practical viability of event-based perception for real-time VT&R navigation.
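The core matching step described above can be sketched in a few lines of NumPy. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: the function name `fft_cross_correlate`, the toy binary frames, and the frame size are ours, and the sketch omits the binary-frame compression tricks the system relies on for its 2.88 ms latency. It only shows how spatial cross-correlation of a "teach" and a "repeat" event frame reduces to an element-wise product of their spectra, with the correlation peak giving the pixel offset between the two views.

```python
import numpy as np

def fft_cross_correlate(teach_frame, repeat_frame):
    """Estimate the pixel offset between two event frames via
    frequency-domain cross-correlation: by the correlation theorem,
    spatial correlation becomes element-wise multiplication of spectra."""
    F_teach = np.fft.rfft2(teach_frame)
    F_repeat = np.fft.rfft2(repeat_frame)
    # conj(F_teach) * F_repeat corresponds to cross-correlating the
    # repeat frame against the teach frame in the spatial domain.
    corr = np.fft.irfft2(np.conj(F_teach) * F_repeat, s=teach_frame.shape)
    # The correlation peak gives the (cyclic) shift of repeat w.r.t. teach.
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = teach_frame.shape
    # Map large positive cyclic offsets back to negative shifts.
    dx = int(dx) if dx <= w // 2 else int(dx) - w
    dy = int(dy) if dy <= h // 2 else int(dy) - h
    return dx, dy

# Toy binary event frames: the "repeat" view is the "teach" view
# shifted 3 pixels to the right.
teach = np.zeros((64, 64), dtype=np.float32)
teach[30:34, 20:24] = 1.0
repeat = np.roll(teach, 3, axis=1)
print(fft_cross_correlate(teach, repeat))  # (3, 0)
```

In a VT&R loop, the recovered horizontal offset would typically drive a steering correction toward the taught path; the FFT route makes this matching O(N log N) per frame instead of the O(N²) cost of a dense spatial search.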
## Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Visual Teach and Repeat Navigation | Track 1 Indoor | Cross-Track Error (XTE) | 5.68 | 10 |
| Visual Teach and Repeat Navigation | Track 2 Indoor | Cross-Track Error (XTE) | 7.68 | 10 |
| Visual Teach and Repeat Navigation | Track 3 Indoor | Cross-Track Error (XTE) | 8.57 | 10 |
| Visual Teach and Repeat Navigation | Track 4 Outdoor | Cross-Track Error (XTE) | 5.27 | 10 |
| Visual Teach and Repeat Navigation | Track 5 Outdoor | Cross-Track Error (XTE) | 5.78 | 10 |
| Visual Teach and Repeat Navigation | Track 6 Outdoor, Night-time | Cross-Track Error (XTE) | 5.22 | 10 |