LSTA: Long Short-Term Attention for Egocentric Action Recognition

About

Egocentric activity recognition is one of the most challenging tasks in video analysis. It requires fine-grained discrimination of small objects and their manipulation. While some methods rely on strong supervision and attention mechanisms, they are either annotation-intensive or do not take spatio-temporal patterns into account. In this paper we propose LSTA as a mechanism to focus on features from spatially relevant parts while attention is tracked smoothly across the video sequence. We demonstrate the effectiveness of LSTA on egocentric activity recognition with an end-to-end trainable two-stream architecture, achieving state-of-the-art performance on four standard benchmarks.
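
The idea of focusing on spatially relevant features while keeping attention smooth over time can be illustrated with a small sketch. This is not the LSTA unit from the paper; it swaps in a plain softmax spatial attention blended with the previous frame's map through an exponential moving average. The function names, the mean-activation saliency score, and the `momentum` parameter are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend_video(frames, momentum=0.8):
    """Pool each frame's features with a temporally smoothed spatial attention map.

    frames: list of T frames; each frame is a list of N spatial locations;
            each location is a list of C feature values.
    momentum: hypothetical weight given to the previous frame's attention map.
    Returns (pooled, maps): T pooled C-dim vectors and T attention maps.
    """
    pooled, maps = [], []
    prev_map = None
    for frame in frames:
        # Hypothetical saliency score: mean activation at each location.
        scores = [sum(loc) / len(loc) for loc in frame]
        attn = softmax(scores)
        if prev_map is not None:
            # Blend with the previous map so attention drifts instead of jumping.
            attn = [momentum * p + (1.0 - momentum) * a
                    for p, a in zip(prev_map, attn)]
        # Attention-weighted sum over locations gives one vector per frame.
        channels = len(frame[0])
        vec = [sum(w * loc[c] for w, loc in zip(attn, frame))
               for c in range(channels)]
        pooled.append(vec)
        maps.append(attn)
        prev_map = attn
    return pooled, maps

# Toy clip: the active location moves from index 0 to index 1.
frames = [
    [[4.0, 4.0], [0.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [4.0, 4.0], [0.0, 0.0]],
]
pooled, maps = attend_video(frames)
```

In the toy clip, the second frame's map still places most of its weight on the first location, because the blend favors the previous map. A learned recurrent mechanism would decide how quickly to shift attention, whereas this sketch fixes that rate by hand.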

Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz • 2018

Related benchmarks

Task | Dataset | Result | Rank
Action Recognition | EPIC-Kitchens v1 (test s2 (unseen)) | Actions Top-1 Acc: 16.63 | 32
Action Recognition | EPIC-Kitchens s1 (seen) v1 (test) | Actions Top-1 Accuracy: 30.2 | 29
Action Recognition | EGTEA Gaze+ | Accuracy: 61.86 | 18
Action Recognition | EPIC-KITCHENS 1 (S1 Seen kitchens) | Top-1 Accuracy (Verb): 59.55 | 17
Egocentric Action Recognition | EPIC-Kitchens test (S1) | Top-1 Acc (Verb): 59.55 | 16
Egocentric Action Recognition | EPIC-KITCHENS S2 (test) | Top-1 Accuracy (Verb): 47.32 | 16
Egocentric Activity Recognition | GTEA 61 | Accuracy: 80.01 | 14
Egocentric Activity Recognition | GTEA 61 (fixed split) | Accuracy: 79.31 | 13
Egocentric Activity Recognition | GTEA 71 | Accuracy: 78.14 | 13
Action Recognition | EPIC-KITCHENS S2 (test) | Top-1 Verb Accuracy: 47.32 | 11
Showing 10 of 14 rows
