Return of Frustratingly Easy Unsupervised Video Domain Adaptation

About

Unsupervised video domain adaptation (UVDA) is a practical but under-explored problem. In this paper, we propose a frustratingly easy UVDA method, called MetaTrans. Specifically, MetaTrans adopts a concise learning objective that contains only two fundamental loss terms. Despite the simplicity of the learning objective, MetaTrans embodies an advanced UVDA idea, that is, handling the spatial and temporal divergence of cross-domain videos separately, through a subtle model architecture design. By implementing a temporal-static subtraction module, MetaTrans effectively removes spatial and temporal divergence. Extensive empirical evaluations, particularly on various cross-domain action recognition tasks, show substantial absolute adaptation performance enhancement and significantly superior relative performance gain compared with state-of-the-art UVDA baselines.

Pengfei Wei, Yiqun Sun, Zhiqiang Xu, Yiping Ke, Lawrence B. Hsieh• 2026

Related benchmarks

Task	Dataset	Result	Rank
Unsupervised Domain Adaptation	UCF-HMDB	Accuracy (U -> H)92.2		52
Action Recognition	Epic-Kitchens	Average Comparison Score51		47

Showing 2 of 2 rows

Other info

Follow for update

@wizwand_team Discord