Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection

About

In this paper, we propose a long-sequence modeling framework, named StreamPETR, for multi-view 3D object detection. Built upon the sparse query design in the PETR series, we systematically develop an object-centric temporal mechanism. The model runs online, and long-term historical information is propagated through object queries frame by frame. In addition, we introduce a motion-aware layer normalization to model the movement of the objects. StreamPETR achieves significant performance improvements over the single-frame baseline with only negligible computation cost. On the standard nuScenes benchmark, it is the first online multi-view method to achieve performance comparable to lidar-based methods (67.6% NDS and 65.3% AMOTA). The lightweight version achieves 45.0% mAP at 31.7 FPS, outperforming the state-of-the-art method (SOLOFusion) by 2.3% mAP while running 1.8x faster. Code is available at https://github.com/exiawsh/StreamPETR.git.
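The two ideas in the abstract, propagating object queries frame by frame and normalizing them conditioned on motion, can be illustrated with a toy sketch. This is not taken from the StreamPETR repository. The names `QueryMemory`, `motion_conditioned_norm`, and `run_stream`, the FIFO length, the top-k size, and the linear form of the scale and shift are all assumptions made for illustration, and random numbers stand in for image features, the transformer decoder, and real motion attributes.

```python
from collections import deque
from statistics import fmean, pvariance
import random


def layer_norm(vec, eps=1e-5):
    """Normalize one query vector to zero mean and unit variance."""
    mean = fmean(vec)
    std = (pvariance(vec, mean) + eps) ** 0.5
    return [(v - mean) / std for v in vec]


def motion_conditioned_norm(query, motion, w_gamma, w_beta):
    """Layer norm whose per-channel scale and shift are linear in `motion`.

    Assumed form: gamma = 1 + W_gamma @ motion, beta = W_beta @ motion.
    """
    out = []
    for c, value in enumerate(layer_norm(query)):
        gamma = 1.0 + sum(m * w for m, w in zip(motion, w_gamma[c]))
        beta = sum(m * w for m, w in zip(motion, w_beta[c]))
        out.append(gamma * value + beta)
    return out


class QueryMemory:
    """FIFO store holding the top-k queries of the most recent frames."""

    def __init__(self, num_frames=4, top_k=8):
        self.frames = deque(maxlen=num_frames)  # oldest frame drops out
        self.top_k = top_k

    def read(self):
        return [q for frame in self.frames for q in frame]

    def write(self, queries, scores):
        ranked = sorted(zip(scores, queries), key=lambda p: p[0], reverse=True)
        self.frames.append([q for _, q in ranked[: self.top_k]])


def run_stream(num_steps=6, num_new=16, dim=8, motion_dim=3, seed=0):
    """Process frames one at a time; return the query count seen per frame."""
    rng = random.Random(seed)
    w_gamma = [[rng.gauss(0, 0.01) for _ in range(motion_dim)] for _ in range(dim)]
    w_beta = [[rng.gauss(0, 0.01) for _ in range(motion_dim)] for _ in range(dim)]
    memory = QueryMemory()
    sizes = []
    for _ in range(num_steps):
        # Propagated queries are re-normalized with a stand-in motion vector.
        past = [
            motion_conditioned_norm(
                q, [rng.gauss(0, 1) for _ in range(motion_dim)], w_gamma, w_beta
            )
            for q in memory.read()
        ]
        new = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_new)]
        queries = past + new  # a real model would run a decoder here
        scores = [rng.random() for _ in queries]  # stand-in confidences
        memory.write(queries, scores)
        sizes.append(len(queries))
    return sizes
```

With these defaults, the number of queries per frame grows by 8 for each stored frame and then stays fixed once the FIFO is full, which is one way to see why the per-frame cost of such a scheme stays bounded regardless of sequence length.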

Shihao Wang, Yingfei Liu, Tiancai Wang, Ying Li, Xiangyu Zhang • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| 3D Object Detection | nuScenes (val) | NDS 59.2 | 981 |
| 3D Object Detection | nuScenes (test) | mAP 62 | 903 |
| 3D Object Detection | NuScenes v1.0 (test) | mAP 62 | 230 |
| 3D Object Detection | Waymo Open Dataset (val) | -- | 219 |
| 3D Object Detection | nuScenes (val) | NDS 63 | 217 |
| 3D Object Detection | nuScenes v1.0 (val) | mAP (Overall) 50.4 | 207 |
| 3D Object Detection | nuScenes v1.0-trainval (val) | NDS 59.2 | 182 |
| 3D Multi-Object Tracking | nuScenes (test) | ID Switches 1.04e+3 | 139 |
| 3D Object Detection | Argoverse 2 (val) | mAP 20.3 | 101 |
| 3D Object Detection | Waymo (val) | -- | 38 |
Showing 10 of 19 rows
