
Robust Visual Tracking by Segmentation

About

Estimating the target extent poses a fundamental challenge in visual object tracking. Typically, trackers are box-centric and fully rely on a bounding box to define the target in the scene. In practice, objects often have complex shapes and are not aligned with the image axis. In these cases, bounding boxes do not provide an accurate description of the target and often contain a majority of background pixels. We propose a segmentation-centric tracking pipeline that not only produces a highly accurate segmentation mask, but also internally works with segmentation masks instead of bounding boxes. Thus, our tracker is able to better learn a target representation that clearly differentiates the target in the scene from background content. In order to achieve the necessary robustness for the challenging tracking scenario, we propose a separate instance localization component that is used to condition the segmentation decoder when producing the output mask. We infer a bounding box from the segmentation mask, validate our tracker on challenging tracking datasets and achieve the new state of the art on LaSOT with a success AUC score of 69.7%. Since most tracking datasets do not contain mask annotations, we cannot use them to evaluate predicted segmentation masks. Instead, we validate our segmentation quality on two popular video object segmentation datasets.
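The abstract says a bounding box is inferred from the segmentation mask but does not say how. A minimal sketch, assuming the simplest rule (the tightest axis-aligned box around all foreground pixels), is shown below; the list-of-lists mask representation and the names `mask_to_box` and `background_fraction` are hypothetical and not taken from the authors' code. The second helper illustrates the abstract's point that a box around a non-axis-aligned object can contain mostly background.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]            # rows of 0/1 values (assumed representation)
Box = Tuple[int, int, int, int]   # (x, y, width, height) in pixels


def mask_to_box(mask: Mask) -> Optional[Box]:
    """Tightest axis-aligned box around nonzero pixels, or None for an empty mask."""
    xs = [x for row in mask for x, v in enumerate(row) if v]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    x0, x1 = min(xs), max(xs)
    y0, y1 = min(ys), max(ys)
    return (x0, y0, x1 - x0 + 1, y1 - y0 + 1)


def background_fraction(mask: Mask, box: Box) -> float:
    """Share of pixels inside the box that are not part of the mask."""
    x, y, w, h = box
    foreground = sum(1 for row in mask[y:y + h] for v in row[x:x + w] if v)
    return 1.0 - foreground / (w * h)


# A thin diagonal object: its box covers the whole 4x4 patch.
diagonal = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
box = mask_to_box(diagonal)
print(box, background_fraction(diagonal, box))
```

For the diagonal example, only 4 of the 16 pixels in the box belong to the object, so three quarters of the box is background.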

Matthieu Paul, Martin Danelljan, Christoph Mayer, Luc Van Gool • 2022

Related benchmarks

Task                       Dataset                 Metric                        Result  Rank
Video Object Segmentation  DAVIS 2017 (val)        J mean                        77.9    1130
Visual Object Tracking     TrackingNet (test)      Normalized Precision (Pnorm)  86      460
Visual Object Tracking     LaSOT (test)            AUC                           69.7    444
Video Object Segmentation  YouTube-VOS 2019 (val)  J-Score (Seen)                77.9    231
Object Tracking            TrackingNet             Precision (P)                 79.4    225
Visual Object Tracking     NfS                     AUC                           0.654   112
Object Tracking            COESOT (test)           SR                            56.1    50
Single Object Tracking     COESOT (test)           SR                            56.1    47
Visual Tracking            UAV123                  AUC                           67.6    41
Visual Object Tracking     AVisT (test)            AUC                           50.8    35
Showing 10 of 16 rows
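
Several rows above report a success AUC. As a rough sketch of how such a score is commonly computed for box-based tracking benchmarks: take the per-frame intersection over union (IoU) between predicted and ground-truth boxes, measure the fraction of frames whose IoU exceeds each threshold in [0, 1], and average over thresholds. The threshold spacing (21 steps of 0.05), the strict `>` comparison, and the function names here are assumptions; individual benchmark toolkits may differ in these details.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def success_auc(pred: Sequence[Box], gt: Sequence[Box], num_thresholds: int = 21) -> float:
    """Mean success rate over evenly spaced IoU thresholds in [0, 1]."""
    overlaps: List[float] = [iou(p, g) for p, g in zip(pred, gt)]
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    # Success rate at a threshold: fraction of frames with IoU above it.
    rates = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(rates) / len(rates)


# Two frames: one perfect prediction, one complete miss.
gt_boxes = [(0.0, 0.0, 10.0, 10.0), (0.0, 0.0, 10.0, 10.0)]
pred_boxes = [(0.0, 0.0, 10.0, 10.0), (50.0, 50.0, 10.0, 10.0)]
print(success_auc(pred_boxes, gt_boxes))
```

Scores computed this way lie in [0, 1]; leaderboards often report them multiplied by 100, which may explain why some AUC entries in the table appear on a different scale.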
