
Memory-and-Anticipation Transformer for Online Action Understanding

About

Most existing forecasting systems are memory-based methods, which attempt to mimic human forecasting ability by employing various memory mechanisms, and they have made progress in temporal modeling of memory dependency. Nevertheless, an obvious weakness of this paradigm is that it can only model limited historical dependence and cannot transcend the past. In this paper, we rethink the temporal dependence of event evolution and propose a novel memory-anticipation-based paradigm that models the entire temporal structure, including the past, present, and future. Based on this idea, we present the Memory-and-Anticipation Transformer (MAT), a memory-anticipation-based approach to online action detection and anticipation. In addition, owing to its inherent design, MAT can process the online action detection and anticipation tasks in a unified manner. The proposed MAT model is tested on four challenging benchmarks, TVSeries, THUMOS'14, HDD, and EPIC-Kitchens-100, for online action detection and anticipation, and it significantly outperforms all existing methods. Code is available at https://github.com/Echo0125/Memory-and-Anticipation-Transformer.
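To make the paradigm concrete, here is a minimal NumPy sketch of the core idea: alongside memory tokens for past frames, a set of anticipation tokens stands in for the future, and the current frame is refined by attending to them. This is an illustrative toy, not the authors' implementation; the shapes, the single-head attention, and the random features standing in for learned ones are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Plain scaled dot-product attention (single head, no projections).
    d = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d)) @ v

rng = np.random.default_rng(0)
d = 16
memory = rng.standard_normal((32, d))        # past frame features (the "memory")
present = rng.standard_normal((1, d))        # current frame feature
anticipation = rng.standard_normal((4, d))   # tokens standing in for future events
                                             # (learnable in a real model; random here)

# Sketch of the memory-anticipation interaction:
# 1) future tokens read from the past and present,
# 2) the present frame is then refined by the predicted future.
context = np.concatenate([memory, present], axis=0)
future = attention(anticipation, context, context)           # (4, d)
refined_present = present + attention(present, future, future)  # (1, d)
```

The key contrast with a purely memory-based model is the second step: the representation used for the online decision is conditioned not only on accumulated history but also on explicit future tokens.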

Jiahao Wang, Guo Chen, Yifei Huang, Limin Wang, Tong Lu • 2023

Related benchmarks

Task | Dataset | Metric | Result | Rank
--- | --- | --- | --- | ---
Online Action Detection | THUMOS14 (test) | mAP | 71.7 | 93
Online Action Detection | TVSeries | mcAP | 89.7 | 71
Action Anticipation | EPIC-KITCHENS 100 (test) | Overall Action Top-5 Recall | 19.5 | 70
Online Action Detection | TVSeries (test) | mcAP | 89.7 | 41
Online Action Detection | THUMOS 14 | Mean F-AP | 71.6 | 37
Online Action Detection | HDD | Overall mAP | 32.7 | 29
Action Anticipation | TVSeries (test) | mcAP | 82.6 | 22
Online Action Detection | CrossTask | P-F1 | 34.2 | 20
Online Action Detection | Epic Kitchens 100 | Segment F1 | 17.5 | 20
Online Action Detection | Ego4D GoalStep | Segment F1 | 9.5 | 20
Showing 10 of 20 rows
