
Spatial-Temporal Relation Networks for Multi-Object Tracking

About

Recent progress in multiple object tracking (MOT) has shown that a robust similarity score is key to the success of trackers. A good similarity score is expected to reflect multiple cues, e.g. appearance, location, and topology, over a long period of time. However, these cues are heterogeneous, which makes them hard to combine in a unified network. As a result, existing methods usually encode them in separate networks or require a complex training approach. In this paper, we present a unified framework for similarity measurement that simultaneously encodes various cues and performs reasoning across both the spatial and temporal domains. We also study the feature representation of a tracklet-object pair in depth, showing that a proper design of the pair features can substantially strengthen the tracker. The resulting approach is named spatial-temporal relation networks (STRN). It runs in a feed-forward way and can be trained end to end. It achieves state-of-the-art accuracy on all of the MOT15-17 benchmarks under the public-detection, online setting.
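To make the role of a similarity score concrete, here is a minimal sketch of online tracklet-to-detection association driven by a score that mixes an appearance cue and a location cue. It is not the STRN model: STRN learns its similarity with a network, whereas this sketch uses a hand-weighted sum of cosine similarity and box IoU with greedy matching. The function names, the dictionary layout (`feat`, `box`), the weight `w_app`, and the threshold are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two appearance feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def similarity(track, det, w_app=0.5):
    """Hand-weighted mix of appearance and location cues (an assumption,
    standing in for a learned similarity)."""
    appearance = cosine(track["feat"], det["feat"])
    location = iou(track["box"], det["box"])
    return w_app * appearance + (1.0 - w_app) * location


def associate(tracks, dets, threshold=0.3):
    """Greedy one-to-one matching: highest-scoring pairs are taken first."""
    scored = sorted(
        ((similarity(t, d), ti, di)
         for ti, t in enumerate(tracks)
         for di, d in enumerate(dets)),
        reverse=True,
    )
    used_tracks, used_dets, matches = set(), set(), []
    for score, ti, di in scored:
        if score < threshold:
            break  # all remaining pairs score lower still
        if ti in used_tracks or di in used_dets:
            continue
        used_tracks.add(ti)
        used_dets.add(di)
        matches.append((ti, di))
    return matches
```

Detections left unmatched would typically start new tracklets, and tracklets left unmatched would be kept for a while or terminated. Those policies are outside this sketch.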

Jiarui Xu, Yue Cao, Zheng Zhang, Han Hu • 2019

Related benchmarks

Task                     | Dataset            | Result    | Rank
-------------------------|--------------------|-----------|-----
Multiple Object Tracking | MOT17 (test)       | MOTA 50.9 | 921
Multi-Object Tracking    | MOT16 (test)       | MOTA 48.5 | 228
Multi-Object Tracking    | MOT17 1.0 (test)   | MOTA 50.9 | 48
Multiple Object Tracking | 2D MOT15 (test)    | MOTA 38.1 | 34
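For readers unfamiliar with the metric in the Result column, MOTA (multiple object tracking accuracy, from the CLEAR MOT metrics) is one minus the ratio of total errors (missed objects, false positives, identity switches) to the number of ground-truth objects, accumulated over all frames. Benchmarks usually report it as a percentage. A minimal sketch follows; the function name and the example counts are illustrative assumptions, not values from this paper.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """MOTA = 1 - (FN + FP + IDSW) / GT, with counts summed over all frames.

    Returns a fraction; multiply by 100 for the percentage form that
    benchmarks typically report. The value can be negative when the
    errors outnumber the ground-truth objects.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


# Hypothetical counts, chosen only to illustrate the arithmetic.
score = mota(false_negatives=30, false_positives=15, id_switches=5, num_gt=100)
```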