NeuralNetwork-Viterbi: A Framework for Weakly Supervised Video Learning

About

Video learning is an important task in computer vision and has attracted increasing interest in recent years. Since even a small number of videos easily comprises several million frames, methods that do not rely on frame-level annotation are of special importance. In this work, we propose a novel learning algorithm with a Viterbi-based loss that allows for online and incremental learning from weakly annotated video data. We moreover show that explicit context and length modeling leads to huge improvements in video segmentation and labeling tasks, and we include these models in our framework. On several action segmentation benchmarks, we obtain an improvement of up to 10% compared to current state-of-the-art methods.
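To make the idea of a Viterbi-based loss more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the weak annotation of a video is an ordered list of action labels (a transcript) and that a network has already produced frame-wise log-probabilities. The function name `viterbi_align` and the toy data are hypothetical, and the context and length models mentioned in the abstract are deliberately left out. Viterbi decoding finds the best monotonic assignment of frames to the transcript, and the resulting frame labels can then serve as targets for an ordinary frame-wise loss.

```python
import numpy as np

def viterbi_align(log_probs, transcript):
    """Align T frames to an ordered transcript of N labels (hypothetical helper).

    log_probs:  array of shape (T, C) with frame-wise log-probabilities.
    transcript: ordered list of N class indices; each must cover >= 1 frame.
    Returns (frame_labels, best_score).
    """
    T = log_probs.shape[0]
    N = len(transcript)
    if N == 0 or N > T:
        raise ValueError("transcript must have between 1 and T entries")

    # score[t, n]: best log-score with frame t assigned to transcript entry n
    score = np.full((T, N), -np.inf)
    # back[t, n]: 1 if the best path entered entry n at frame t, else 0
    back = np.zeros((T, N), dtype=int)
    score[0, 0] = log_probs[0, transcript[0]]

    for t in range(1, T):
        for n in range(N):
            stay = score[t - 1, n]
            advance = score[t - 1, n - 1] if n > 0 else -np.inf
            if advance > stay:
                score[t, n] = advance + log_probs[t, transcript[n]]
                back[t, n] = 1
            else:
                score[t, n] = stay + log_probs[t, transcript[n]]

    # Backtrack from the last transcript entry at the last frame.
    labels = [0] * T
    n = N - 1
    for t in range(T - 1, -1, -1):
        labels[t] = transcript[n]
        n -= back[t, n]
    return labels, float(score[T - 1, N - 1])


# Toy example (made-up numbers): 6 frames, 3 classes.
favored = [2, 2, 0, 0, 0, 1]
probs = np.full((6, 3), 0.1)
probs[np.arange(6), favored] = 0.8
log_probs = np.log(probs)

labels, best = viterbi_align(log_probs, [2, 0, 1])
# Frame-wise negative log-likelihood on the aligned labels,
# the quantity a network would be trained to reduce in this sketch.
loss = -log_probs[np.arange(6), labels].mean()
```

In this simplified form the alignment depends only on the frame-wise scores. A fuller version in the spirit of the abstract would add context and length terms to the path score.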

Alexander Richard, Hilde Kuehne, Ahsan Iqbal, Juergen Gall • 2018

Related benchmarks

Task	Dataset	Result
Temporal action segmentation	Breakfast	Accuracy 74.1	119
Temporal action segmentation	50Salads	Accuracy 78.7	117
Action Segmentation	Breakfast	MoF 43	78
Action Segmentation	Breakfast (test)	MoF 43	31
Action Segmentation	COIN	Frame Accuracy 21.2	29
Action Segmentation	Breakfast 14	MoF 43	26
Action Segmentation	COIN (test)	Frame Accuracy 21.2	23
Action Segmentation	Breakfast Action dataset	MoF 43	22
Action Segmentation	50Salads mid granularity	MoF 49.4	19
Action Segmentation	50Salads (test)	--	16

