
Improving Spatiotemporal Self-Supervision by Deep Reinforcement Learning

About

Self-supervised learning of convolutional neural networks can harness large amounts of cheap unlabeled data to train powerful feature representations. As a surrogate task, we jointly address the ordering of visual data in the spatial and temporal domains. The permutations of training samples, which are at the core of self-supervision by ordering, have so far been sampled randomly from a fixed preselected set. Based on deep reinforcement learning, we propose a sampling policy that adapts to the state of the network being trained. As a result, new permutations are sampled according to their expected utility for updating the convolutional feature representation. Experimental evaluation on unsupervised and transfer learning tasks demonstrates competitive performance on standard benchmarks for image and video classification and nearest neighbor retrieval.
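To make the idea of sampling permutations by expected utility concrete, here is a minimal sketch. It is not the authors' method: the paper learns the policy with deep reinforcement learning, whereas this stand-in keeps one running utility estimate per permutation and samples through a softmax, in the style of a simple bandit. The names `PermutationSampler`, `shuffle`, `step_size`, and the scalar `reward` (for example, how much the network's error dropped after training on that permutation) are all hypothetical.

```python
import itertools
import math
import random


def softmax(scores, temperature=1.0):
    """Turn raw utility scores into sampling probabilities."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class PermutationSampler:
    """Hypothetical bandit-style stand-in for a learned sampling policy."""

    def __init__(self, num_tiles=4, step_size=0.5, seed=0):
        # Candidate set: every ordering of num_tiles items (assumption;
        # a real setup would likely use a preselected subset).
        self.permutations = list(itertools.permutations(range(num_tiles)))
        self.utility = [0.0] * len(self.permutations)
        self.step_size = step_size
        self.rng = random.Random(seed)

    def probabilities(self):
        return softmax(self.utility)

    def sample(self):
        """Draw a permutation with probability tied to its utility."""
        indices = range(len(self.permutations))
        index = self.rng.choices(indices, weights=self.probabilities(), k=1)[0]
        return index, self.permutations[index]

    def update(self, index, reward):
        """Move the utility estimate toward the observed reward."""
        self.utility[index] += self.step_size * (reward - self.utility[index])


def shuffle(items, permutation):
    """Reorder tiles or frames according to a permutation."""
    return [items[i] for i in permutation]


if __name__ == "__main__":
    sampler = PermutationSampler(num_tiles=4)
    index, permutation = sampler.sample()
    print(shuffle(["a", "b", "c", "d"], permutation))
    sampler.update(index, reward=1.0)  # placeholder reward signal
```

Starting from uniform utilities, the sampler behaves like the random baseline described above and drifts toward permutations that earned higher rewards as updates accumulate.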

Uta Büchler, Biagio Brattoli, Björn Ommer • 2018

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Action Recognition | UCF101 (mean of 3 splits) | Accuracy | 58.6 | 357 |
| Action Recognition | UCF101 (test) | Accuracy | 58.6 | 307 |
| Action Recognition | HMDB51 (test) | Accuracy | 0.25 | 249 |
| Action Recognition | HMDB-51 (average of three splits) | Top-1 Acc | 25 | 204 |
| Action Classification | HMDB51 (over all three splits) | Accuracy | 25 | 121 |
| Video Retrieval | UCF101 (1) | Top-1 Acc | 25.7 | 92 |
| Video Retrieval | UCF101 | Top-1 Acc | 25.7 | 63 |
| Video Retrieval | UCF101 (test) | Top-1 Acc | 25.7 | 55 |
