YOLOv4: Optimal Speed and Accuracy of Object Detection

About

There are a huge number of features which are said to improve Convolutional Neural Network (CNN) accuracy. Practical testing of combinations of such features on large datasets, and theoretical justification of the result, is required. Some features operate on certain models exclusively and for certain problems exclusively, or only for small-scale datasets; while some features, such as batch-normalization and residual-connections, are applicable to the majority of models, tasks, and datasets. We assume that such universal features include Weighted-Residual-Connections (WRC), Cross-Stage-Partial-connections (CSP), Cross mini-Batch Normalization (CmBN), Self-adversarial-training (SAT) and Mish-activation. We use new features: WRC, CSP, CmBN, SAT, Mish activation, Mosaic data augmentation, DropBlock regularization, and CIoU loss, and combine some of them to achieve state-of-the-art results: 43.5% AP (65.7% AP50) for the MS COCO dataset at a real-time speed of ~65 FPS on a Tesla V100. Source code is at https://github.com/AlexeyAB/darknet
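The abstract names Mish activation and CIoU loss without defining them. The sketch below shows both for scalar inputs and a single pair of boxes, using the formulas from the papers that introduced them, not the Darknet source. The function names, the `(x1, y1, x2, y2)` box layout, and the `eps` guard are assumptions made for illustration.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable ln(1 + e^x).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    # Mish activation: x * tanh(softplus(x)).
    return x * math.tanh(softplus(x))


def ciou_loss(box, gt, eps: float = 1e-9) -> float:
    """CIoU loss for two boxes given as (x1, y1, x2, y2) corner tuples.

    loss = 1 - IoU + rho^2 / c^2 + alpha * v
    The box layout and eps guard are illustrative assumptions.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt

    # Widths and heights of both boxes.
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    iw = max(0.0, min(x2, gx2) - max(x1, gx1))
    ih = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = iw * ih
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Squared distance between the two box centres (rho^2).
    rho2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both (c^2).
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term v and its trade-off weight alpha.
    v = (4 / math.pi ** 2) * (
        math.atan(gw / (gh + eps)) - math.atan(w / (h + eps))
    ) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v
```

Unlike a plain IoU loss, the centre-distance term still gives a non-constant signal when the two boxes do not overlap at all, and the aspect-ratio term penalises shape mismatch even when the centres coincide.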

Alexey Bochkovskiy, Chien-Yao Wang, Hong-Yuan Mark Liao • 2020

Related benchmarks

Task	Dataset	Result
Object Detection	COCO 2017 (val)	AP 21.7	2843
Object Detection	COCO (test-dev)	mAP 55.8	1239
Object Detection	MS COCO (test-dev)	mAP@.5 65.7	677
Object Detection	COCO (val)	mAP 43.5	637
Object Detection	COCO v2017 (test-dev)	mAP 43.5	499
Instance Segmentation	COCO 2017 (test-dev)	--	253
Object Detection	VeDAI	mAP@0.5 62.55	38
Object Detection	VisDrone	mAP50 30.7	36
Object Detection	SESYD Floorplans (test)	AP50 93.04	21
Object Detection	PKU-DDD Car 17	mAP50 81.3	20

Showing 10 of 20 rows
