SparseAlign: A Fully Sparse Framework for Cooperative Object Detection

About

Cooperative perception can widen the field of view and reduce occlusion for an ego vehicle, thereby improving the perception performance and safety of autonomous driving. Despite the success of previous work on cooperative object detection, most approaches operate on dense Bird's Eye View (BEV) feature maps, which are computationally demanding and hard to extend to long-range detection. More efficient, fully sparse frameworks have rarely been explored. In this work, we design a fully sparse framework, SparseAlign, with three key features: an enhanced sparse 3D backbone, a query-based temporal context learning module, and a robust detection head specially tailored for sparse features. Extensive experimental results on both the OPV2V and DairV2X datasets show that our framework, despite its sparsity, outperforms the state of the art while requiring less communication bandwidth. In addition, experiments on the OPV2Vt and DairV2Xt datasets for time-aligned cooperative object detection show a significant performance gain over the baseline works.

Yunshuang Yuan, Yan Xia, Daniel Cremers, Monika Sester • 2025

Related benchmarks

Task                                       Dataset   Metric   Result  Rank
3D Object Detection                        DAIR-V2X  AP@0.50  84.5    57
3D Object Detection                        OPV2V     AP@0.50  92.2    47
Cooperative Object Detection               OPV2V     AP@0.5   93      18
Time-Aligned Cooperative Object Detection  OPV2Vt    AP@0.5   89.3    6
Time-Aligned Cooperative Object Detection  DairV2Xt  AP@0.5   79.6    6
3D Object Detection                        OPV2Vt    AP@0.5   89.8    1
3D Object Detection                        DairV2Xt  AP@0.5   69.8    1