Empowering Source-Free Domain Adaptation via MLLM-Guided Reliability-Based Curriculum Learning
About
Existing SFDA methods struggle to fully exploit pre-trained knowledge and often rely on a single model's predictions or handcrafted prompts, limiting robustness under domain shift. Multimodal Large Language Models (MLLMs) offer a promising alternative: they encode rich visual-semantic knowledge and generalize well without task-specific tuning. However, their use in SFDA is hindered by instruction-following failures, inconsistent outputs, and high inference costs. We propose Reliability-based Curriculum Learning (RCL), a novel framework that distills robust supervision from multiple frozen MLLMs into a compact target model. RCL organizes adaptation as a three-stage curriculum that progressively incorporates pseudo-labels based on inter-model agreement and model confidence, enabling stable and noise-aware training. Our approach achieves state-of-the-art performance on the standard SFDA benchmarks Office-Home, DomainNet-126, and VisDA-C, outperforming both zero-shot MLLMs and their ensembles, all without accessing source data or tuning foundation models. Our code is available at: https://github.com/Dong-Jie-Chen/RCL.
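The curriculum described above can be sketched as a simple partitioning step: samples whose pseudo-labels all frozen MLLMs agree on enter training first, partially agreed or confidently predicted samples enter next, and the remainder is deferred to a final noise-aware stage. The sketch below is illustrative only, assuming majority-vote agreement and hypothetical thresholds (`majority_thresh`, `conf_thresh`); it is not the authors' implementation.

```python
import numpy as np

def partition_by_reliability(mllm_labels, target_conf,
                             majority_thresh=0.5, conf_thresh=0.7):
    """Split target samples into three curriculum stages (illustrative sketch).

    mllm_labels: (num_mllms, num_samples) int array of class predictions
                 from the frozen MLLMs.
    target_conf: (num_samples,) confidence scores of the target model.
    Returns three index arrays: reliable, less_reliable, unreliable.
    """
    mllm_labels = np.asarray(mllm_labels)
    target_conf = np.asarray(target_conf)
    num_mllms, num_samples = mllm_labels.shape

    # Per-sample fraction of MLLMs agreeing with the majority label.
    agree = np.empty(num_samples)
    for j in range(num_samples):
        counts = np.bincount(mllm_labels[:, j])
        agree[j] = counts.max() / num_mllms

    unanimous = agree == 1.0
    # Stage 1: all MLLMs agree -> pseudo-label treated as reliable.
    reliable = np.where(unanimous)[0]
    # Stage 2: a majority agrees, or the target model is already confident
    # (threshold values here are assumptions, not the paper's).
    less_reliable = np.where(~unanimous &
                             ((agree >= majority_thresh) |
                              (target_conf >= conf_thresh)))[0]
    # Stage 3: everything else, deferred to the final noise-aware stage.
    mask = np.ones(num_samples, dtype=bool)
    mask[reliable] = False
    mask[less_reliable] = False
    unreliable = np.where(mask)[0]
    return reliable, less_reliable, unreliable
```

In practice the stage-2 and stage-3 sets would also carry the majority pseudo-label and a per-sample weight, so later stages can down-weight noisy supervision rather than discard it.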
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | DomainNet (test) | Average Accuracy | 89.7 | 209 |
| Domain Adaptation | Office-Home (test) | Mean Accuracy | 90.2 | 112 |
| Unsupervised Domain Adaptation | DomainNet (test) | Average Accuracy | 89.7 | 97 |
| Domain Adaptation | DomainNet (test) | Accuracy (C→P) | 88.1 | 12 |
| Domain Adaptation | VisDA (test) | S→R Accuracy | 93.3 | 12 |