Memory-and-Anticipation Transformer for Online Action Understanding

About

Most existing forecasting systems are memory-based: they attempt to mimic the human ability to forecast by employing various memory mechanisms, and they have advanced temporal modeling of memory dependencies. An obvious weakness of this paradigm, however, is that it can model only limited historical dependence and cannot transcend the past. In this paper, we rethink the temporal dependence of event evolution and propose a novel memory-anticipation-based paradigm that models the entire temporal structure, including the past, present, and future. Based on this idea, we present the Memory-and-Anticipation Transformer (MAT), a memory-anticipation-based approach to the online action detection and anticipation tasks. In addition, owing to this design, MAT can process online action detection and anticipation in a unified manner. We evaluate MAT on four challenging benchmarks, TVSeries, THUMOS'14, HDD, and EPIC-Kitchens-100, for online action detection and anticipation, and it significantly outperforms all existing methods. Code is available at https://github.com/Echo0125/Memory-and-Anticipation-Transformer.
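As a rough illustration of the paradigm the abstract describes (not the authors' implementation), the sketch below keeps a bounded memory of past frame features and a set of learned "anticipation" tokens standing in for future frames; the present estimate attends over both, so detection and anticipation share one pass. All class and function names, the single-head attention, and the random token initialization are assumptions for illustration only.

```python
import math
import random

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    d = len(query)
    weights = softmax([dot(query, k) / math.sqrt(d) for k in keys])
    out = [0.0] * len(values[0])
    for w, v in zip(weights, values):
        for i, vi in enumerate(v):
            out[i] += w * vi
    return out

class MemoryAnticipationSketch:
    """Toy memory-anticipation module: a sliding memory of past frame
    features plus learned anticipation tokens that stand in for future
    frames. The present estimate attends over past AND anticipated
    future, so "what is happening now" and "what happens next" are
    produced in one unified forward pass."""

    def __init__(self, dim, memory_len, future_len, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.memory_len = memory_len
        self.memory = []  # FIFO buffer of past frame features
        # learned future placeholders (random-initialized here)
        self.anticipation = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                             for _ in range(future_len)]

    def step(self, frame_feature):
        # 1. push the newest frame feature into the bounded memory
        self.memory.append(frame_feature)
        if len(self.memory) > self.memory_len:
            self.memory.pop(0)
        # 2. anticipation tokens attend to memory (future conditioned on past)
        futures = [attend(a, self.memory, self.memory)
                   for a in self.anticipation]
        # 3. the present estimate attends over past and anticipated future
        context = self.memory + futures
        present = attend(frame_feature, context, context)
        return present, futures
```

In a real model, the attention would be multi-head with learned projections and the outputs would feed classification heads for the current and future action labels; this sketch only shows how the same attended context can serve both tasks.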

Jiahao Wang, Guo Chen, Yifei Huang, Limin Wang, Tong Lu • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Online Action Detection | THUMOS'14 (test) | mAP | 71.7 | 86 |
| Action Anticipation | EPIC-KITCHENS-100 (test) | Overall Action Top-5 Recall | 19.5 | 59 |
| Online Action Detection | TVSeries | mcAP | 89.7 | 57 |
| Online Action Detection | TVSeries (test) | mcAP | 89.7 | 41 |
| Online Action Detection | THUMOS'14 | Mean F-AP | 71.6 | 37 |
| Online Action Detection | HDD | Overall mAP | 32.7 | 29 |
| Action Anticipation | TVSeries (test) | mcAP | 82.6 | 22 |
| Action Anticipation | EPIC-Kitchens-100 Unseen | Verb Recall@5 | 32.5 | 15 |
| Action Anticipation | THUMOS'14 | mAP (Avg) | 58.2 | 14 |
| Action Anticipation | THUMOS'14 (test) | -- | -- | 14 |

Showing 10 of 18 rows.

Other info

Code: https://github.com/Echo0125/Memory-and-Anticipation-Transformer