
TrouSPI-Net: Spatio-temporal attention on parallel atrous convolutions and U-GRUs for skeletal pedestrian crossing prediction

About

Understanding the behaviors and intentions of pedestrians remains one of the main challenges for vehicle autonomy, as accurate predictions of their intentions can guarantee both their safety and the driving comfort of vehicles. In this paper, we address pedestrian crossing prediction in urban traffic environments by linking the dynamics of a pedestrian's skeleton to a binary crossing intention. We introduce TrouSPI-Net: a context-free, lightweight, multi-branch predictor. TrouSPI-Net extracts spatio-temporal features at different time resolutions by encoding sequences of skeletal joint positions as pseudo-images and processing them with parallel attention modules and atrous convolutions. The approach is then enhanced by processing features such as relative distances between skeletal joints, bounding box positions, or ego-vehicle speed with U-GRUs. We evaluate TrouSPI-Net and analyze its performance using the newly proposed evaluation procedures for JAAD and PIE, two large public naturalistic data sets for studying pedestrian behavior in traffic. Experimental results show that TrouSPI-Net achieves an F1 score of 0.76 on JAAD and 0.80 on PIE, outperforming the current state of the art while remaining lightweight and context-free.
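To make the idea of "different time resolutions" concrete, here is a minimal sketch, not the authors' implementation. It assumes a pseudo-image is simply a grid with one row per frame and one column per joint coordinate, and it applies the same small temporal kernel at several dilation rates, which is what an atrous convolution does. The function names, the kernel, and the dilation rates are hypothetical choices for illustration. The attention modules and the U-GRU branch described in the abstract are omitted.

```python
from typing import Dict, List, Sequence, Tuple

Joint = Tuple[float, float]   # (x, y) position of one skeletal joint
Row = List[float]             # one pseudo-image row = one frame, flattened


def to_pseudo_image(sequence: Sequence[Sequence[Joint]]) -> List[Row]:
    """Stack per-frame joint coordinates into a T x (2*J) grid (rows = time)."""
    return [[coord for joint in frame for coord in joint] for frame in sequence]


def atrous_conv1d(rows: List[Row], kernel: Sequence[float], dilation: int) -> List[Row]:
    """'Valid' 1D dilated convolution along the time axis, column by column."""
    span = (len(kernel) - 1) * dilation  # temporal extent covered by the kernel
    n_cols = len(rows[0])
    return [
        [
            sum(k * rows[t + i * dilation][c] for i, k in enumerate(kernel))
            for c in range(n_cols)
        ]
        for t in range(len(rows) - span)
    ]


def multi_resolution_features(
    rows: List[Row],
    kernel: Sequence[float],
    dilations: Sequence[int] = (1, 2, 4),  # hypothetical rates, one per branch
) -> Dict[int, List[Row]]:
    """Run the same kernel in parallel branches with different dilation rates."""
    return {d: atrous_conv1d(rows, kernel, d) for d in dilations}


# Toy input: 10 frames, 2 joints moving at constant speed along x.
frames = [[(float(t), 0.0), (2.0 * t, 1.0)] for t in range(10)]
image = to_pseudo_image(frames)

# A temporal-difference kernel: each output is rows[t + 2*d] - rows[t].
features = multi_resolution_features(image, kernel=(-1.0, 0.0, 1.0))
```

With this toy input, a larger dilation rate measures displacement over a longer time window using the same three-tap kernel, at the cost of fewer valid output rows. A real model would learn the kernel weights and use a deep-learning framework rather than plain lists.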

Joseph Gesnouin, Steve Pechberti, Bogdan Stanciulescu, Fabien Moutarde • 2021

Related benchmarks

Task                                       Dataset            Result        Rank
Pedestrian Intention Prediction            JAAD (All)         Accuracy 85   37
Pedestrian crossing intention prediction   PIE set03 (test)   Accuracy 88   16
Pedestrian crossing intention prediction   JAADbeh (test)     Accuracy 64   15
