Mistake Attribution: Fine-Grained Mistake Understanding in Egocentric Videos

About

We introduce Mistake Attribution (MATT), a new task for fine-grained understanding of human mistakes in egocentric videos. While prior work detects whether a mistake occurs, MATT attributes the mistake to what part of the instruction is violated (semantic role), when in the video the deviation becomes irreversible (the Point-of-No-Return, PNR), and where the mistake appears in the PNR frame. We develop MisEngine, a data engine that automatically constructs mistake samples from existing datasets with attribution-rich annotations. Applied to large egocentric corpora, MisEngine yields EPIC-KITCHENS-M and Ego4D-M -- two datasets up to two orders of magnitude larger than prior mistake datasets. We then present MisFormer, a unified attention-based model for mistake attribution across semantic, temporal, and spatial dimensions, trained with MisEngine supervision. A human study demonstrates the ecological validity of our MisEngine-constructed mistake samples, confirming that EPIC-KITCHENS-M and Ego4D-M can serve as reliable benchmarks for mistake understanding. Experiments on both our datasets and prior benchmarks show that MisFormer, as a single unified model, outperforms task-specific SOTA methods by at least 6.66%, 21.81%, 18.7%, and 3.00% in video-language understanding, temporal localization, hand-object interaction, and mistake detection, respectively. Project page: https://yayuanli.github.io/MATT/
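To make the three attribution dimensions concrete, here is a minimal sketch of how a single MATT prediction might be represented. The class name, field names, role labels, and the box format are all hypothetical illustrations, not taken from the paper or from any released code.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class MistakeAttribution:
    """Illustrative container for one prediction (all names are assumptions)."""

    is_mistake: bool
    # Semantic attribution: which part of the instruction is violated.
    # The label strings (e.g. "verb", "object") are assumed for illustration.
    violated_role: Optional[str] = None
    # Temporal attribution: index of the Point-of-No-Return (PNR) frame.
    pnr_frame: Optional[int] = None
    # Spatial attribution: where the mistake appears in the PNR frame.
    pnr_box: Optional[Box] = None

    def is_consistent(self) -> bool:
        """A mistake carries all three attributions; a non-mistake carries none."""
        fields = (self.violated_role, self.pnr_frame, self.pnr_box)
        if self.is_mistake:
            return all(f is not None for f in fields)
        return all(f is None for f in fields)


example = MistakeAttribution(
    is_mistake=True,
    violated_role="object",
    pnr_frame=42,
    pnr_box=(10.0, 20.0, 110.0, 140.0),
)
```

Grouping the three outputs in one record mirrors the abstract's framing of a single unified model, but how MisFormer actually structures its outputs is not specified here.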

Yayuan Li, Aadit Jain, Filippos Bellos, Jason J. Corso • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Semantic Attribution | EPIC-KITCHENS-M (test) | Average Accuracy | 84.91 | 5 |
| Semantic Attribution | Ego4D-M (test) | Average Accuracy | 62.03 | 5 |
| Mistake Detection | EgoPER | F1@0.5 | 35.18 | 4 |
| Mistake Detection | EPIC-KITCHENS-M (EK) (test) | F1@0.5 | 78.05 | 3 |
| Mistake Detection | Ego4D-M (test) | F1@0.5 | 57.55 | 3 |
| Temporal Attribution | Ego4D-M (test) | MAE (frames) | 19.14 | 3 |
| Spatial Attribution | Ego4D-M | mIoU | 59.21 | 3 |
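For readers unfamiliar with two of the metrics listed here, the following sketch shows how MAE in frames and mIoU are commonly computed. It assumes that temporal predictions are PNR frame indices and that spatial predictions are axis-aligned boxes in `(x_min, y_min, x_max, y_max)` form; the benchmark's actual evaluation protocol may differ (for example, it may use masks or report values as percentages), and the function names are illustrative.

```python
from typing import Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def mae_frames(pred: Sequence[int], gt: Sequence[int]) -> float:
    """Mean absolute error between predicted and ground-truth frame indices."""
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(pred)


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_iou(preds: Sequence[Box], gts: Sequence[Box]) -> float:
    """Average per-sample IoU over paired predictions and ground truths."""
    if len(preds) != len(gts) or len(preds) == 0:
        raise ValueError("preds and gts must be non-empty and of equal length")
    return sum(box_iou(p, g) for p, g in zip(preds, gts)) / len(preds)
```

Lower is better for MAE (frames), while higher is better for mIoU, accuracy, and F1.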