Object State Change Classification in Egocentric Videos using the Divided Space-Time Attention Mechanism

About

This report describes our submission, "TarHeels", to the Ego4D: Object State Change Classification Challenge. We use a transformer-based video recognition model and leverage the Divided Space-Time Attention mechanism to classify object state changes in egocentric videos. Our submission achieves the second-best performance in the challenge. Furthermore, we perform an ablation study showing that identifying object state changes in egocentric videos requires temporal modeling ability. Lastly, we present several positive and negative examples to visualize our model's predictions. The code is publicly available at: https://github.com/md-mohaiminul/ObjectStateChange
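The abstract names the Divided Space-Time Attention mechanism without describing it. As a rough illustration only, and not the authors' implementation, the sketch below assumes the common formulation in which each patch token first attends to the tokens at the same spatial location in the other frames (temporal step) and then to the tokens in its own frame (spatial step). Learned query/key/value projections, multiple heads, layer normalization, and the classification token are deliberately omitted, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention over a list of vectors.

    Simplifying assumption: queries, keys, and values are the tokens
    themselves (no learned projection matrices).
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for q in tokens:
        weights = softmax([scale * sum(a * b for a, b in zip(q, k)) for k in tokens])
        out.append([sum(w * v[d] for w, v in zip(weights, tokens)) for d in range(dim)])
    return out


def divided_space_time_attention(video):
    """Apply temporal attention, then spatial attention, with residuals.

    video[t][s] is the embedding (list of floats) of patch s in frame t.
    """
    num_frames, num_patches = len(video), len(video[0])

    # Temporal step: each spatial location attends across all frames.
    after_time = [[None] * num_patches for _ in range(num_frames)]
    for s in range(num_patches):
        column = [video[t][s] for t in range(num_frames)]
        attended = self_attention(column)
        for t in range(num_frames):
            after_time[t][s] = [x + a for x, a in zip(video[t][s], attended[t])]

    # Spatial step: within each frame, every patch attends to all patches.
    out = []
    for t in range(num_frames):
        attended = self_attention(after_time[t])
        out.append([
            [x + a for x, a in zip(after_time[t][s], attended[s])]
            for s in range(num_patches)
        ])
    return out


# Toy input: 2 frames, 3 patches per frame, 2-dimensional embeddings.
toy_video = [
    [[0.1, 0.2], [0.3, 0.1], [0.0, 0.5]],
    [[0.2, 0.1], [0.4, 0.0], [0.1, 0.4]],
]
toy_output = divided_space_time_attention(toy_video)
```

Splitting attention this way means each token is compared with T + S others rather than T × S (for T frames and S patches per frame), while the temporal step still lets the model relate the same region before and after a change.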

Md Mohaiminul Islam, Gedas Bertasius • 2022

Related benchmarks

Task                                         Dataset        Result          Rank
Object State Change Classification (OSCC)    Ego4D (test)   Accuracy: 72    13
Object State Change Classification           Ego4D (val)    Accuracy: 70.8  12