ADCrowdNet: An Attention-injective Deformable Convolutional Network for Crowd Understanding
About
We propose ADCrowdNet, an attention-injective deformable convolutional network for crowd understanding that addresses the accuracy degradation seen in highly congested, noisy scenes. ADCrowdNet consists of two concatenated networks. An attention-aware network, the Attention Map Generator (AMG), first detects crowd regions in an image and computes the degree of congestion in those regions. Using the detected crowd regions and congestion priors, a multi-scale deformable network, the Density Map Estimator (DME), then generates high-quality density maps. With its attention-aware training scheme and multi-scale deformable convolutions, ADCrowdNet captures crowd features more effectively and is more robust to various kinds of noise. We evaluated our method on four popular crowd counting datasets (ShanghaiTech, UCF_CC_50, WorldExpo'10, and UCSD) and on an additional vehicle counting dataset, TRANCOS; our approach outperforms existing state-of-the-art approaches on all of these datasets.
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Crowd Counting | ShanghaiTech Part B | MAE 7.6 | 160 |
| Crowd Counting | ShanghaiTech Part A | MAE 63.2 | 138 |
| Crowd Counting | WorldExpo'10 (test) | Scene 1 Error 1.6 | 80 |
| Crowd Counting | UCF_CC_50 | MAE 257.1 | 60 |
| Crowd Counting | UCSD crowd-counting (test) | MAE 0.98 | 36 |
| Vehicle Counting | TRANCOS | GAME0 Error 2.39 | 7 |
| Vehicle Counting | TRANCOS 34 | MAE 2.44 | 6 |
| Density Map Estimation | ShanghaiTech Part_A | PSNR 24.48 | 2 |
| Density Map Estimation | ShanghaiTech Part B | PSNR 29.35 | 2 |
| Density Map Estimation | UCF_CC_50 | PSNR 20.08 | 2 |
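
For readers unfamiliar with the metrics in the table, the following sketch shows how MAE, GAME, and PSNR are commonly defined for counting and density estimation. It is a minimal illustration under stated assumptions, not the evaluation code behind these results: `game` scores a single image (benchmarks average it over the test set), and `psnr` takes the peak value from the ground-truth map, which is one convention among several.

```python
import math
from typing import List, Sequence

Grid = List[List[float]]


def mae(pred_counts: Sequence[float], true_counts: Sequence[float]) -> float:
    """Mean absolute error between predicted and true per-image counts."""
    return sum(abs(p - t) for p, t in zip(pred_counts, true_counts)) / len(true_counts)


def _region_sum(grid: Grid, r0: int, r1: int, c0: int, c1: int) -> float:
    """Sum of the density values inside rows [r0, r1) and columns [c0, c1)."""
    return sum(sum(row[c0:c1]) for row in grid[r0:r1])


def game(pred: Grid, truth: Grid, level: int) -> float:
    """Grid Average Mean absolute Error for one image.

    The image is split into 4**level non-overlapping regions and the absolute
    count errors of the regions are summed. Level 0 is the plain absolute
    count error of the whole image.
    """
    n = 2 ** level
    h, w = len(truth), len(truth[0])
    total = 0.0
    for i in range(n):
        for j in range(n):
            r0, r1 = i * h // n, (i + 1) * h // n
            c0, c1 = j * w // n, (j + 1) * w // n
            total += abs(_region_sum(pred, r0, r1, c0, c1)
                         - _region_sum(truth, r0, r1, c0, c1))
    return total


def psnr(pred: Grid, truth: Grid) -> float:
    """Peak signal-to-noise ratio in dB (assumption: peak = max of ground truth)."""
    h, w = len(truth), len(truth[0])
    mse = sum((p - t) ** 2
              for p_row, t_row in zip(pred, truth)
              for p, t in zip(p_row, t_row)) / (h * w)
    if mse == 0:
        return math.inf
    peak = max(max(row) for row in truth)
    return 10 * math.log10(peak ** 2 / mse)


# Two objects predicted in the wrong cells: the total count is right,
# but every quadrant is off by one.
truth = [[1.0, 0.0],
         [0.0, 1.0]]
pred = [[0.0, 1.0],
        [1.0, 0.0]]
game0 = game(pred, truth, 0)
game1 = game(pred, truth, 1)
example_mae = mae([10, 20], [12, 19])
example_psnr = psnr(pred, truth)
```

The toy example shows why GAME is reported alongside count error: the misplaced prediction scores a perfect GAME at level 0 but is penalised at level 1, where localisation matters.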