On Multi-Domain Long-Tailed Recognition, Imbalanced Domain Generalization and Beyond
About
Real-world data often exhibit imbalanced label distributions. Existing studies on data imbalance focus on single-domain settings, i.e., samples are from the same data distribution. However, natural data can originate from distinct domains, where a minority class in one domain could have abundant instances from other domains. We formalize the task of Multi-Domain Long-Tailed Recognition (MDLT), which learns from multi-domain imbalanced data, addresses label imbalance, domain shift, and divergent label distributions across domains, and generalizes to all domain-class pairs. We first develop the domain-class transferability graph, and show that such transferability governs the success of learning in MDLT. We then propose BoDA, a theoretically grounded learning strategy that tracks the upper bound of transferability statistics, and ensures balanced alignment and calibration across imbalanced domain-class distributions. We curate five MDLT benchmarks based on widely used multi-domain datasets, and compare BoDA to twenty algorithms that span different learning strategies. Extensive and rigorous experiments verify the superior performance of BoDA. Further, as a byproduct, BoDA establishes a new state of the art on Domain Generalization benchmarks, highlighting the importance of addressing data imbalance across domains, which can be crucial for improving generalization to unseen domains. Code and data are available at: https://github.com/YyzHarry/multi-domain-imbalance.
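To make the MDLT setting concrete, the following is a minimal sketch with made-up toy data. It is not the authors' code and does not implement BoDA. It shows how a class can be rare in one domain but abundant in another, and why a score averaged over all domain-class pairs can differ sharply from plain accuracy. The domain names, class names, counts, and the helper functions `label_counts_per_domain` and `balanced_pair_accuracy` are all hypothetical. Averaging per-pair accuracy is only one plausible reading of "generalizes to all domain-class pairs"; the paper's exact evaluation protocol is not stated here.

```python
from collections import Counter

# Hypothetical toy data: (domain, class) pairs, invented for illustration.
# "giraffe" is a minority class in "photo" but abundant in "sketch".
samples = (
    [("photo", "dog")] * 90 + [("photo", "giraffe")] * 5
    + [("sketch", "dog")] * 4 + [("sketch", "giraffe")] * 80
)


def label_counts_per_domain(pairs):
    """Count how many samples each class has within each domain."""
    counts = {}
    for domain, label in pairs:
        counts.setdefault(domain, Counter())[label] += 1
    return counts


def balanced_pair_accuracy(records):
    """Average the accuracy of every (domain, class) pair with equal weight.

    This is an assumed metric for illustration, not the paper's definition.
    """
    correct, total = Counter(), Counter()
    for domain, label, pred in records:
        total[(domain, label)] += 1
        correct[(domain, label)] += int(pred == label)
    return sum(correct[k] / total[k] for k in total) / len(total)


counts = label_counts_per_domain(samples)

# A naive predictor that always outputs the majority class of each domain.
majority = {d: c.most_common(1)[0][0] for d, c in counts.items()}
records = [(d, y, majority[d]) for d, y in samples]

plain_acc = sum(p == y for _, y, p in records) / len(records)
pair_acc = balanced_pair_accuracy(records)

print(counts)
print(f"plain accuracy: {plain_acc:.3f}")
print(f"accuracy averaged over domain-class pairs: {pair_acc:.3f}")
```

In this toy case the majority-class predictor looks strong under plain accuracy but gets only half of the domain-class pairs right, which is the kind of gap that motivates treating label imbalance and domain shift jointly.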
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Classification | PACS | Overall Average Accuracy | 65.7 | 230 |
| Diabetic Retinopathy Classification | DEEPDR (test) | Accuracy | 0.502 | 30 |
| Image Classification | PACS TotalHeavyTail setting (test) | Overall Accuracy | 0.741 | 24 |
| Image Classification | VLCS | Average Accuracy | 58.2 | 24 |
| Image Classification | OfficeHome TotalHeavyTail setting (test) | Avg Accuracy | 47.1 | 24 |
| Image Classification | OfficeHome | Average Accuracy | 53.5 | 24 |
| Image Classification | VLCS GINIDG setting (test) | Average Accuracy | 76.3 | 24 |
| Diabetic Retinopathy Grading | APTOS ESDG (test) | AUC | 67.6 | 24 |
| Image Classification | VLCS TotalHeavyTail setting (test) | Average Accuracy | 73 | 24 |
| Diabetic Retinopathy Grading | FGADR ESDG (test) | AUC | 57.1 | 24 |