Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection

About

In this paper, we propose StreamPETR, a long-sequence modeling framework for multi-view 3D object detection. Building on the sparse query design of the PETR series, we systematically develop an object-centric temporal mechanism. The model runs online, and long-term historical information is propagated through object queries frame by frame. In addition, we introduce a motion-aware layer normalization to model object movement. Compared with the single-frame baseline, StreamPETR achieves significant performance improvements at negligible additional computation cost. On the standard nuScenes benchmark, it is the first online multi-view method to achieve performance comparable to lidar-based methods (67.6% NDS and 65.3% AMOTA). The lightweight version reaches 45.0% mAP at 31.7 FPS, outperforming the state-of-the-art method (SOLOFusion) by 2.3% mAP while running 1.8x faster. Code is available at https://github.com/exiawsh/StreamPETR.git.
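The frame-by-frame propagation of object queries described in the abstract can be illustrated with a small sketch. This is not the StreamPETR implementation: the names `ObjectQuery`, `QueryMemory`, and `run_stream`, the `top_k` and `max_frames` parameters, and the choice of a first-in-first-out buffer that keeps the highest-scoring queries are all assumptions made for illustration. The sketch only shows the general idea of carrying a bounded set of past queries into each new frame instead of reprocessing past images.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Tuple


@dataclass
class ObjectQuery:
    """Hypothetical container for one decoded object query."""
    frame: int              # index of the frame that produced the query
    score: float            # detection confidence
    embedding: List[float]  # query feature vector


class QueryMemory:
    """FIFO buffer holding the top-k queries of the most recent frames (assumed design)."""

    def __init__(self, top_k: int, max_frames: int) -> None:
        self.top_k = top_k
        # deque(maxlen=...) silently drops the oldest frame once the buffer is full.
        self.frames: Deque[List[ObjectQuery]] = deque(maxlen=max_frames)

    def update(self, queries: List[ObjectQuery]) -> None:
        """Store the top_k highest-scoring queries of the current frame."""
        kept = sorted(queries, key=lambda q: q.score, reverse=True)[: self.top_k]
        self.frames.append(kept)

    def propagated(self) -> List[ObjectQuery]:
        """Return all stored queries, oldest frame first."""
        return [q for frame in self.frames for q in frame]


def run_stream(
    detections_per_frame: List[List[ObjectQuery]],
    top_k: int = 2,
    max_frames: int = 3,
) -> Tuple[QueryMemory, List[int]]:
    """Process frames online and record how many past queries each frame could see."""
    memory = QueryMemory(top_k, max_frames)
    history_sizes = []
    for queries in detections_per_frame:
        # A real detector would let the decoder attend to memory.propagated()
        # together with the current frame's queries at this point.
        history_sizes.append(len(memory.propagated()))
        memory.update(queries)
    return memory, history_sizes


if __name__ == "__main__":
    frames = [
        [ObjectQuery(t, s, [0.0]) for s in (0.1, 0.9, 0.5)]
        for t in range(5)
    ]
    memory, sizes = run_stream(frames)
    print(sizes)
    print([(q.frame, q.score) for q in memory.propagated()])
```

Because the buffer size is fixed, the per-frame cost of consulting the history stays constant no matter how long the sequence runs, which is the property that makes this style of temporal modeling attractive for online use.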

Shihao Wang, Yingfei Liu, Tiancai Wang, Ying Li, Xiangyu Zhang • 2023

Related benchmarks

Task                       Dataset                        Result                 Rank
3D Object Detection        nuScenes (val)                 NDS 59.2               981
3D Object Detection        nuScenes (test)                mAP 62                 874
3D Object Detection        NuScenes v1.0 (test)           mAP 62                 210
3D Object Detection        nuScenes v1.0 (val)            mAP (Overall) 50.4     207
3D Object Detection        Waymo Open Dataset (val)       --                     200
3D Multi-Object Tracking   nuScenes (test)                ID Switches 1.04e+3    139
3D Object Detection        nuScenes (val)                 mAP 55.5               128
3D Object Detection        nuScenes v1.0-trainval (val)   NDS 55                 121
3D Object Detection        Argoverse 2 (val)              mAP 20.3               76
3D Object Detection        Waymo (val)                    --                     38
Showing 10 of 16 rows
