PETR: Position Embedding Transformation for Multi-View 3D Object Detection

About

In this paper, we develop position embedding transformation (PETR) for multi-view 3D object detection. PETR encodes the position information of 3D coordinates into image features, producing 3D position-aware features. Object queries can perceive the 3D position-aware features and perform end-to-end object detection. PETR achieves state-of-the-art performance (50.4% NDS and 44.1% mAP) on the standard nuScenes dataset and ranks 1st on the benchmark. It can serve as a simple yet strong baseline for future research. Code is available at https://github.com/megvii-research/PETR.

Yingfei Liu, Tiancai Wang, Xiangyu Zhang, Jian Sun • 2022
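
The abstract's central idea, encoding 3D coordinates into image features, can be illustrated with a toy sketch. This is not the authors' implementation (the repository linked in the abstract is the reference for that). It assumes a pinhole camera, a hand-picked region of interest for normalization, and a single fixed linear layer standing in for whatever learned encoder the method uses. All function names here (`frustum_points`, `to_world`, `normalize`, `add_position_embedding`) are hypothetical.

```python
def frustum_points(feat_w, feat_h, depths, img_w, img_h):
    """For each feature-map cell, list homogeneous points (u*d, v*d, d, 1), one per depth bin."""
    cells = []
    for j in range(feat_h):
        v = (j + 0.5) * img_h / feat_h      # pixel row at the cell centre
        for i in range(feat_w):
            u = (i + 0.5) * img_w / feat_w  # pixel column at the cell centre
            cells.append([[u * d, v * d, d, 1.0] for d in depths])
    return cells


def to_world(cells, img_to_world):
    """Map every frustum point to 3D space with a 4x4 matrix and keep (x, y, z)."""
    def apply(p):
        return [sum(img_to_world[r][c] * p[c] for c in range(4)) for r in range(3)]
    return [[apply(p) for p in cell] for cell in cells]


def normalize(cells, lo, hi):
    """Scale coordinates to [0, 1] using an assumed region of interest [lo, hi] per axis."""
    return [[[(x - l) / (h - l) for x, l, h in zip(p, lo, hi)] for p in cell]
            for cell in cells]


def add_position_embedding(features, cells, weights):
    """Project each cell's flattened 3D coordinates to C channels and add them to its feature."""
    out = []
    for feat, cell in zip(features, cells):
        flat = [x for p in cell for x in p]  # length 3 * number of depth bins
        emb = [sum(w * x for w, x in zip(row, flat)) for row in weights]
        out.append([f + e for f, e in zip(feat, emb)])
    return out


# Toy setup (all values are illustrative assumptions):
# 2x2 feature map, 2 depth bins, 4 channels, pinhole camera at the origin.
f, cx, cy = 100.0, 50.0, 50.0
img_to_world = [[1 / f, 0, -cx / f, 0],   # inverse of a pinhole projection matrix
                [0, 1 / f, -cy / f, 0],
                [0, 0, 1, 0],
                [0, 0, 0, 1]]
depths = [1.0, 10.0]
cells = frustum_points(2, 2, depths, img_w=100, img_h=100)
world = to_world(cells, img_to_world)
coords = normalize(world, lo=[-10.0, -10.0, 0.0], hi=[10.0, 10.0, 20.0])
features = [[0.0] * 4 for _ in cells]                     # placeholder image features
weights = [[0.1] * (3 * len(depths)) for _ in range(4)]   # stand-in for a learned layer
aware = add_position_embedding(features, coords, weights)
```

In a multi-view setting, the same steps would run once per camera with that camera's own matrix, so that features from all views share one 3D coordinate space. Under this reading, that shared space is what lets object queries attend across views.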

Related benchmarks

Task	Dataset	Result
3D Object Detection	nuScenes (val)	NDS 49.6	981
3D Object Detection	nuScenes (test)	mAP 44.5	903
3D Object Detection	nuScenes v1.0 (test)	mAP 44.5	230
3D Object Detection	Waymo Open Dataset (val)	--	219
3D Object Detection	nuScenes (val)	NDS 44.2	217
3D Object Detection	nuScenes v1.0 (val)	mAP (Overall) 40.3	207
3D Object Detection	Argoverse 2 (val)	mAP 17.6	101
3D Object Detection	Waymo Open Dataset LEVEL_1 (val)	3D AP 20.9	60
Object Detection	nuScenes (val)	mAP 37	48
3D Object Detection	nuScenes LiDAR Beamsreduce	NDS 0.3521	41

