Domain-Agnostic Mutual Prompting for Unsupervised Domain Adaptation

About

Conventional Unsupervised Domain Adaptation (UDA) strives to minimize distribution discrepancy between domains, which neglects to harness rich semantics from data and struggles to handle complex domain shifts. A promising technique is to leverage the knowledge of large-scale pre-trained vision-language models for more guided adaptation. Despite some endeavors, current methods often learn textual prompts to embed domain semantics for source and target domains separately and perform classification within each domain, limiting cross-domain knowledge transfer. Moreover, prompting only the language branch lacks flexibility to adapt both modalities dynamically. To bridge this gap, we propose Domain-Agnostic Mutual Prompting (DAMP) to exploit domain-invariant semantics by mutually aligning visual and textual embeddings. Specifically, the image contextual information is utilized to prompt the language branch in a domain-agnostic and instance-conditioned way. Meanwhile, visual prompts are imposed based on the domain-agnostic textual prompt to elicit domain-invariant visual embeddings. These two branches of prompts are learned mutually with a cross-attention module and regularized with a semantic-consistency loss and an instance-discrimination contrastive loss. Experiments on three UDA benchmarks demonstrate the superiority of DAMP over state-of-the-art approaches.

Zhekai Du, Xinyao Li, Fengling Li, Ke Lu, Lei Zhu, Jingjing Li• 2024

Related benchmarks

Task	Dataset	Result
Unsupervised Domain Adaptation	Office-Home	Average Accuracy78.2	279
Image Classification	DomainNet	Accuracy (ClipArt)69.7	238
Domain Generalization	PACS (test)	Average Accuracy97.4	225
Image Classification	Office-Home	Average Accuracy87.1	167
Domain Generalization	Office-Home (test)	Average Accuracy85	121
Domain Generalization	VLCS (test)	--	62
Image Classification	Office-Home	Average Accuracy79.2	59
Image Classification	VisDA 17	Aero Accuracy98.7	47
Image Classification	VisDA (val)	Plane Accuracy97.3	44
Domain Generalization	TerraIncognita (test)	Accuracy53.7	40

Showing 10 of 25 rows

Other info

Follow for update

@wizwand_team Discord