Space-Time Correspondence as a Contrastive Random Walk

About

This paper proposes a simple self-supervised approach for learning a representation for visual correspondence from raw video. We cast correspondence as prediction of links in a space-time graph constructed from video. In this graph, the nodes are patches sampled from each frame, and nodes adjacent in time can share a directed edge. We learn a representation in which pairwise similarity defines transition probability of a random walk, so that long-range correspondence is computed as a walk along the graph. We optimize the representation to place high probability along paths of similarity. Targets for learning are formed without supervision, by cycle-consistency: the objective is to maximize the likelihood of returning to the initial node when walking along a graph constructed from a palindrome of frames. Thus, a single path-level constraint implicitly supervises chains of intermediate comparisons. When used as a similarity metric without adaptation, the learned representation outperforms the self-supervised state-of-the-art on label propagation tasks involving objects, semantic parts, and pose. Moreover, we demonstrate that a technique we call edge dropout, as well as self-supervised adaptation at test-time, further improve transfer for object-centric correspondence.
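The walk described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`transition`, `cycle_walk_loss`), the cosine-similarity softmax, and the temperature value of 0.07 are assumptions made for illustration, and the node embeddings are taken as given rather than produced by a learned encoder. The sketch builds a palindrome of frames, chains the per-step transition matrices, and scores the probability of each node returning to itself.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def transition(a, b, temperature=0.07):
    """Row-stochastic transition matrix from the nodes of one frame to the next.

    a, b: arrays of shape (N, D) holding one embedding per patch (node).
    The temperature is an illustrative assumption.
    """
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    # Pairwise cosine similarity, turned into transition probabilities.
    return softmax(a @ b.T / temperature, axis=-1)


def cycle_walk_loss(frames, temperature=0.07):
    """Cycle-consistency loss for a walk over a palindrome of frames.

    frames: array of shape (T, N, D), node embeddings for T frames.
    Returns (loss, walk), where walk[i, j] is the probability that a walk
    starting at node i of the first frame ends at node j of the first frame.
    """
    # Palindrome: frames 0..T-1 followed by frames T-2..0.
    palindrome = list(frames) + list(frames[-2::-1])
    walk = np.eye(frames.shape[1])
    for a, b in zip(palindrome[:-1], palindrome[1:]):
        walk = walk @ transition(a, b, temperature)
    # Target: every node should return to itself (the diagonal of the walk).
    p_return = np.clip(np.diag(walk), 1e-12, None)
    return -np.log(p_return).mean(), walk


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.normal(size=(3, 5, 8))  # 3 frames, 5 patches, 8-dim embeddings
    loss, walk = cycle_walk_loss(frames)
    print("loss:", loss)
    print("row sums:", walk.sum(axis=1))
```

In training, this loss would be minimized with respect to the encoder that produces the embeddings. Because only the return probability is supervised, the intermediate transitions are constrained indirectly, which is the path-level supervision the abstract refers to.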

Allan Jabri, Andrew Owens, Alexei A. Efros • 2020

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Video Object Segmentation | DAVIS 2017 (val) | J mean | 64.8 | 1130 |
| Video Object Segmentation | YouTube-VOS 2018 (val) | J Score (Seen) | 68.7 | 493 |
| Video Object Segmentation | DAVIS 2017 (test) | J (Jaccard Index) | 72.9 | 107 |
| Medical Image Segmentation | CVC-ClinicDB | Dice Score | 82.92 | 68 |
| Video Object Segmentation | DAVIS 2017 | Jaccard Index (J) | 72.9 | 42 |
| Pose Propagation | JHMDB | PCK@0.1 | 59.3 | 20 |
| Video label propagation | JHMDB (val) | PCK@0.1 | 58.8 | 17 |
| Human Pose Tracking | JHMDB (val) | PCK@.1 | 59.3 | 15 |
| Video Object Segmentation | VOST 1.0 (test) | J_tr | 13.9 | 13 |
| Human Part Propagation | VIP (val) | mIoU | 38.6 | 12 |
Showing 10 of 16 rows
