BMN: Boundary-Matching Network for Temporal Action Proposal Generation
About
Temporal action proposal generation is a challenging and promising task which aims to locate temporal regions in real-world videos where actions or events may occur. Current bottom-up proposal generation methods can generate proposals with precise boundaries, but cannot efficiently generate adequately reliable confidence scores for retrieving proposals. To address these difficulties, we introduce the Boundary-Matching (BM) mechanism to evaluate confidence scores of densely distributed proposals, which denotes a proposal as a matching pair of starting and ending boundaries and combines all densely distributed BM pairs into the BM confidence map. Based on the BM mechanism, we propose an effective, efficient and end-to-end proposal generation method, named Boundary-Matching Network (BMN), which generates proposals with precise temporal boundaries as well as reliable confidence scores simultaneously. The two branches of BMN are jointly trained in a unified framework. We conduct experiments on two challenging datasets, THUMOS-14 and ActivityNet-1.3, where BMN shows significant performance improvement with remarkable efficiency and generalizability. Further, combined with an existing action classifier, BMN can achieve state-of-the-art temporal action detection performance.
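
The idea of denoting each proposal as a matching pair of boundaries and collecting all pairs into one confidence map can be illustrated with a small sketch. This is not the authors' implementation: the grid layout (rows indexed by duration, columns by start location), the function names `bm_proposal_grid` and `top_proposals`, and the toy confidence values are all assumptions made for illustration.

```python
def bm_proposal_grid(num_locations, max_duration):
    """Enumerate dense proposals as (duration_index, start_index) cells.

    Assumed layout: cell (d, t) stands for the proposal that starts at
    temporal location t and ends at location t + d + 1. Cells whose end
    falls outside the video are marked invalid with None.
    """
    grid = []
    for d in range(max_duration):
        row = []
        for t in range(num_locations):
            end = t + d + 1
            row.append((t, end) if end <= num_locations else None)
        grid.append(row)
    return grid


def top_proposals(grid, confidence_map, k):
    """Return the k valid proposals with the highest confidence score."""
    scored = [
        (confidence_map[d][t], cell)
        for d, row in enumerate(grid)
        for t, cell in enumerate(row)
        if cell is not None
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(start, end, score) for score, (start, end) in scored[:k]]


# Toy usage: 4 temporal locations, durations of 1 to 3 locations.
# The confidence values are made up; in BMN they would come from the network.
grid = bm_proposal_grid(num_locations=4, max_duration=3)
toy_confidence = [
    [0.1, 0.2, 0.3, 0.1],
    [0.4, 0.9, 0.2, 0.0],
    [0.5, 0.3, 0.0, 0.0],
]
best = top_proposals(grid, toy_confidence, k=2)
```

Because every (start, duration) pair has a fixed cell, the scores of all candidate proposals can be read from one map in a single pass, which is the property the abstract refers to as evaluating densely distributed proposals.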
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Temporal Action Detection | THUMOS-14 (test) | mAP@tIoU=0.5 | 39.8 | 330 |
| Temporal Action Localization | THUMOS14 (test) | AP @ IoU=0.5 | 38.8 | 319 |
| Temporal Action Localization | THUMOS-14 (test) | mAP@0.3 | 56 | 308 |
| Temporal Action Localization | ActivityNet 1.3 (val) | AP@0.5 | 50.1 | 257 |
| Temporal Action Detection | ActivityNet v1.3 (val) | mAP@0.5 | 52.24 | 185 |
| Temporal Action Proposal | ActivityNet v1.3 (val) | AUC | 67.1 | 114 |
| Temporal Action Localization | THUMOS 2014 | mAP@0.30 | 56 | 93 |
| Temporal Action Detection | ActivityNet 1.3 | mAP@0.5 | 51.23 | 93 |
| Temporal Action Proposal Generation | THUMOS14 (test) | AR@50 | 39.36 | 84 |
| Temporal Action Detection | ActivityNet 1.3 (test) | Average mAP | 36.42 | 80 |
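
Several metrics in the table are reported at a temporal IoU (tIoU) threshold such as 0.5 or 0.3. As a rough illustration of what that threshold means, the sketch below computes the tIoU between a predicted segment and a ground-truth segment. The function names and the `(start, end)` tuple representation are assumptions for illustration, not part of any official evaluation toolkit.

```python
def temporal_iou(proposal, ground_truth):
    """Temporal IoU of two segments given as (start, end) in seconds."""
    start_p, end_p = proposal
    start_g, end_g = ground_truth
    # Overlap is zero when the segments are disjoint.
    intersection = max(0.0, min(end_p, end_g) - max(start_p, start_g))
    union = (end_p - start_p) + (end_g - start_g) - intersection
    return intersection / union if union > 0 else 0.0


def is_match(proposal, ground_truth, threshold=0.5):
    """A prediction counts as correct when its tIoU reaches the threshold."""
    return temporal_iou(proposal, ground_truth) >= threshold


# Toy usage: a 10-second prediction shifted 5 seconds from the ground truth.
overlap = temporal_iou((0.0, 10.0), (5.0, 15.0))
```

In this toy case the segments share 5 seconds out of a 15-second union, so the prediction would count as a match at a threshold of 0.3 but not at 0.5.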