Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection

About

In this paper, we propose a long-sequence modeling framework, named StreamPETR, for multi-view 3D object detection. Building on the sparse query design of the PETR series, we systematically develop an object-centric temporal mechanism. The model runs online, propagating long-term historical information through object queries frame by frame. In addition, we introduce a motion-aware layer normalization to model object movement. Compared to the single-frame baseline, StreamPETR achieves significant performance improvements at negligible additional computation cost. On the standard nuScenes benchmark, it is the first online multi-view method to achieve performance comparable to lidar-based methods (67.6% NDS and 65.3% AMOTA). The lightweight version reaches 45.0% mAP at 31.7 FPS, outperforming the state-of-the-art method (SOLOFusion) by 2.3% mAP while running 1.8x faster. Code is available at https://github.com/exiawsh/StreamPETR.git.

Shihao Wang, Yingfei Liu, Tiancai Wang, Ying Li, Xiangyu Zhang • 2023
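
The frame-by-frame query propagation described in the abstract can be illustrated with a toy loop. This is only a minimal sketch under stated assumptions, not the StreamPETR implementation: the `Query` class, the `step` and `run_stream` functions, the top-k selection rule, and the parameter `k` are all hypothetical names and choices introduced here for illustration, and the transformer decoder that would refine the queries is omitted.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Query:
    """Toy stand-in for an object query (hypothetical, for illustration only)."""
    label: str    # human-readable tag so the queries can be followed across frames
    score: float  # toy confidence score
    age: int = 0  # number of frames this query has been carried forward


def step(fresh: List[Query], carried: List[Query], k: int) -> Tuple[List[Query], List[Query]]:
    """Process one frame in a streaming fashion.

    The active set is the fresh queries for this frame plus the queries carried
    over from earlier frames. A real detector would refine the active set with a
    decoder here; this sketch leaves the queries unchanged. The k highest-scoring
    queries are then kept for the next frame (an assumed selection rule).
    """
    active = fresh + carried
    kept = sorted(active, key=lambda q: q.score, reverse=True)[:k]
    next_carried = [Query(q.label, q.score, q.age + 1) for q in kept]
    return active, next_carried


def run_stream(frames: List[List[Query]], k: int) -> Tuple[List[int], List[Query]]:
    """Run the online loop over a sequence of frames.

    Returns the size of the active set at each frame and the queries carried
    out of the final frame.
    """
    carried: List[Query] = []
    active_sizes: List[int] = []
    for fresh in frames:
        active, carried = step(fresh, carried, k)
        active_sizes.append(len(active))
    return active_sizes, carried


if __name__ == "__main__":
    frames = [
        [Query("a", 0.9), Query("b", 0.1), Query("c", 0.5)],
        [Query("d", 0.2), Query("e", 0.3), Query("f", 0.4)],
        [Query("g", 0.95), Query("h", 0.05), Query("i", 0.1)],
    ]
    sizes, carried = run_stream(frames, k=2)
    print(sizes)
    print([(q.label, q.age) for q in carried])
```

In this toy run, query "a" survives all three frames because its score stays in the top two, which is the sense in which historical information travels with the queries rather than being recomputed from stored past frames.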

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| 3D Object Detection | nuScenes (val) | NDS 59.2 | 941 |
| 3D Object Detection | nuScenes (test) | mAP 62 | 829 |
| 3D Object Detection | NuScenes v1.0 (test) | mAP 62 | 210 |
| 3D Object Detection | nuScenes v1.0 (val) | mAP (Overall) 50.4 | 190 |
| 3D Object Detection | Waymo Open Dataset (val) | -- | 175 |
| 3D Multi-Object Tracking | nuScenes (test) | ID Switches 1.04e+3 | 130 |
| 3D Object Detection | nuScenes v1.0-trainval (val) | NDS 55 | 87 |
| 3D Object Detection | Argoverse 2 (val) | mAP 20.3 | 62 |
| 3D Object Detection | Waymo (val) | -- | 38 |
| 3D Object Tracking | nuScenes (test) | AMOTA 65.3 | 28 |
Showing 10 of 12 rows
