Jointly Modeling Motion and Appearance Cues for Robust RGB-T Tracking
About
In this study, we propose a novel RGB-T tracking framework that jointly models appearance and motion cues. First, to obtain a robust appearance model, we develop a novel late-fusion method to infer fusion weight maps for the RGB and thermal (T) modalities. The fusion weights are predicted by offline-trained global and local multimodal fusion networks and are then used to linearly combine the response maps of the RGB and T modalities. Second, when the appearance cue is unreliable, we comprehensively take motion cues, i.e., target and camera motion, into account to keep the tracker robust. We further propose a tracker switcher that switches flexibly between the appearance and motion trackers. Extensive results on three recent RGB-T tracking datasets show that the proposed tracker performs significantly better than other state-of-the-art algorithms.
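The late-fusion step can be illustrated with a minimal sketch: given per-modality response maps and per-pixel fusion weight maps, the fused response is their pixel-wise weighted sum. This is only an illustration under stated assumptions; the function name `fuse_responses` is hypothetical, and in the paper the weight maps come from the offline-trained global and local fusion networks rather than being hand-set.

```python
import numpy as np

def fuse_responses(resp_rgb, resp_t, w_rgb, w_t, eps=1e-8):
    """Linearly combine RGB and thermal response maps with
    per-pixel fusion weight maps (illustrative sketch; the real
    weights are inferred by the fusion networks in the paper)."""
    # normalize the two weight maps so they sum to 1 at each pixel
    total = np.clip(w_rgb + w_t, eps, None)
    w_rgb, w_t = w_rgb / total, w_t / total
    return w_rgb * resp_rgb + w_t * resp_t

# toy example: 5x5 response maps, RGB trusted more than thermal
rgb_resp = np.random.rand(5, 5)
t_resp = np.random.rand(5, 5)
fused = fuse_responses(rgb_resp, t_resp,
                       np.full((5, 5), 0.7), np.full((5, 5), 0.3))
```

The target location would then be read off as the peak of `fused`, as in standard response-map trackers.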
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| RGB-T Tracking | RGBT234 (test) | Precision Rate | 79 | 189 |
| RGB-T Tracking | GTOT | PR | 90.2 | 114 |
| RGB-T Tracking | RGBT234 (test) | MSR | 57.3 | 41 |
| RGB-T Tracking | LasHeR | PR | 46.7 | 41 |
| RGB-T Tracking | VOT-RGBT2019 | EAO | 49.8 | 40 |
| RGB-T Tracking | RGBT210 (test) | PR | 78.3 | 32 |
| RGB-T Tracking | GTOT (test) | PR | 90.2 | 19 |