Deep Occlusion Reasoning for Multi-Camera Multi-Target Detection
About
People detection in single 2D images has improved greatly in recent years. However, comparatively little of this progress has percolated into multi-camera multi-people tracking algorithms, whose performance still degrades severely when scenes become very crowded. In this work, we introduce a new architecture that combines Convolutional Neural Nets and Conditional Random Fields to explicitly model those ambiguities. One of its key ingredients is a set of high-order CRF terms that model potential occlusions and give our approach its robustness even when many people are present. Our model is trained end-to-end and we show that it outperforms several state-of-the-art algorithms on challenging scenes.
Pierre Baqué, François Fleuret, Pascal Fua • 2017
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Multiview Pedestrian Detection | WILDTRACK (test) | MODA 74.1 | 46 |
| Multiview Pedestrian Detection | MultiviewX (test) | MODA 75.2 | 35 |
| Pedestrian Detection | Wildtrack | MODA 74.1 | 21 |
| Pedestrian Detection | MultiviewX | MODA 75.2 | 21 |
| Multi-View Detection | Wildtrack | MODA 74.1 | 12 |
| Multi-view people detection | MultiviewX | MODA 75.2 | 10 |
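The benchmark results are reported as MODA (Multiple Object Detection Accuracy), which this page does not define. As a minimal sketch, assuming the common CLEAR-style definition with unit costs for misses and false positives, MODA is one minus the ratio of total errors to total ground-truth objects. The function name `moda` and the counts below are hypothetical and chosen for illustration only; they are not taken from the paper or from any benchmark.

```python
def moda(num_misses: int, num_false_positives: int, num_ground_truth: int) -> float:
    """Multiple Object Detection Accuracy, assuming unit error costs.

    MODA = 1 - (misses + false positives) / ground-truth objects,
    with all counts summed over every evaluated frame.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    return 1.0 - (num_misses + num_false_positives) / num_ground_truth


# Hypothetical counts, for illustration only.
score = moda(num_misses=15, num_false_positives=10, num_ground_truth=100)
print(f"MODA = {100 * score:.1f}")  # leaderboards usually report the value times 100
```

Under this definition the score can become negative when the number of errors exceeds the number of ground-truth objects, so it is not bounded below by zero.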