
CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

About

We propose a network for Congested Scene Recognition called CSRNet, a data-driven deep learning method that can understand highly congested scenes, perform accurate count estimation, and produce high-quality density maps. CSRNet is composed of two major components: a convolutional neural network (CNN) as the front-end for 2D feature extraction and a dilated CNN as the back-end, which uses dilated kernels to deliver larger receptive fields and to replace pooling operations. CSRNet is easy to train because of its purely convolutional structure. We demonstrate CSRNet on four datasets (the ShanghaiTech dataset, the UCF_CC_50 dataset, the WorldEXPO'10 dataset, and the UCSD dataset) and deliver state-of-the-art performance. On the ShanghaiTech Part_B dataset, CSRNet achieves 47.3% lower Mean Absolute Error (MAE) than the previous state-of-the-art method. We also extend the targeted applications to counting other objects, such as vehicles in the TRANCOS dataset, where CSRNet significantly improves the output quality with 15.4% lower MAE than the previous state-of-the-art approach.
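To illustrate why dilated kernels enlarge the receptive field without pooling, here is a minimal sketch that uses the standard receptive-field arithmetic for stacked convolutions. The helper names and the three-layer stacks are hypothetical and chosen only for illustration; they are not CSRNet's actual layer configuration.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers):
    """Receptive field of a stack of conv layers.

    `layers` is a list of (kernel_size, stride, dilation) tuples,
    ordered from input to output.
    """
    rf = 1    # a single output pixel sees one input pixel before any layer
    jump = 1  # distance in input pixels between adjacent feature-map positions
    for kernel_size, stride, dilation in layers:
        rf += (effective_kernel_size(kernel_size, dilation) - 1) * jump
        jump *= stride
    return rf


# Hypothetical stacks: three 3x3 convolutions, stride 1, no pooling.
plain = receptive_field([(3, 1, 1)] * 3)    # dilation rate 1
dilated = receptive_field([(3, 1, 2)] * 3)  # dilation rate 2

print(plain, dilated)  # 7 13
```

With the same number of parameters and no downsampling, the dilated stack sees a 13x13 input window instead of 7x7, so the output keeps the input resolution while each position draws on more context.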

Yuhong Li, Xiaofan Zhang, Deming Chen • 2018

Related benchmarks

Task            Dataset                              Metric         Result  Rank
Crowd Counting  ShanghaiTech Part A (test)           MAE            68.2    227
Crowd Counting  ShanghaiTech Part B (test)           MAE            10.6    191
Crowd Counting  ShanghaiTech Part B                  MAE            10.6    160
Crowd Counting  ShanghaiTech Part A                  MAE            68.2    138
Crowd Counting  UCF-QNRF (test)                      MAE            110.6   95
Crowd Counting  WorldExpo'10 (test)                  Scene 1 Error  2.9     80
Crowd Counting  UCF_CC_50 (test)                     MAE            266.1   66
Face Detection  WIDER FACE (val)                     --             --      62
Crowd Counting  UCF_CC_50                            MAE            266.1   60
Crowd Counting  UCF_CC_50 (5-fold cross-validation)  MAE            266.1   43
Showing 10 of 46 rows
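
For readers unfamiliar with the MAE metric reported above, the following is a minimal sketch of how it is commonly computed in crowd counting, assuming the predicted count for an image is the sum of its predicted density map. The function names and the toy density maps are hypothetical and do not come from the paper or any benchmark.

```python
def count_from_density(density_map):
    """Estimated object count: the sum over all density-map pixels."""
    return sum(sum(row) for row in density_map)


def mean_absolute_error(predicted_counts, true_counts):
    """Average absolute difference between predicted and true counts."""
    if len(predicted_counts) != len(true_counts) or not predicted_counts:
        raise ValueError("need two non-empty lists of equal length")
    total = sum(abs(p - t) for p, t in zip(predicted_counts, true_counts))
    return total / len(predicted_counts)


# Toy 2x2 density maps (illustrative values only).
density_maps = [
    [[0.5, 0.5], [1.0, 0.0]],      # sums to 2.0
    [[0.25, 0.25], [0.25, 0.25]],  # sums to 1.0
]
true_counts = [3, 1]

predicted_counts = [count_from_density(m) for m in density_maps]
mae = mean_absolute_error(predicted_counts, true_counts)
print(mae)  # 0.5
```

Lower is better: an MAE of 10.6 means the predicted count is off by about 10.6 objects per image on average.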
