Toward Robustness in Multi-label Classification: A Data Augmentation Strategy against Imbalance and Noise

About

Multi-label classification poses challenges due to imbalanced and noisy labels in training data. We propose a unified data augmentation method, named BalanceMix, to address these challenges. Our approach includes two samplers for imbalanced labels, generating minority-augmented instances with high diversity. It also refines multi-labels at the label-wise granularity, categorizing noisy labels as clean, re-labeled, or ambiguous for robust optimization. Extensive experiments on three benchmark datasets demonstrate that BalanceMix outperforms existing state-of-the-art methods. We release the code at https://github.com/DISL-Lab/BalanceMix.
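
The label-wise refinement described above can be illustrated with a minimal sketch. This is not the released BalanceMix code: the function name `refine_labels`, the thresholds `agree_thr` and `relabel_thr`, and the simple confidence rule are all assumptions chosen for illustration. The sketch only shows what it means to sort each individual (instance, class) label entry into clean, re-labeled, or ambiguous, rather than judging a whole instance at once.

```python
import numpy as np

def refine_labels(given, probs, agree_thr=0.5, relabel_thr=0.95):
    """Categorize every (instance, class) label entry (illustrative rule only).

    given: (n, k) array of 0/1 labels, possibly noisy.
    probs: (n, k) array of model-predicted probabilities that the label is 1.
    Returns (category, target), both of shape (n, k).
    """
    given = np.asarray(given, dtype=int)
    probs = np.asarray(probs, dtype=float)

    # Hard prediction per label and the model's confidence in that prediction.
    pred = (probs >= agree_thr).astype(int)
    conf = np.where(pred == 1, probs, 1.0 - probs)

    # Default every entry to "ambiguous", then overwrite the other two cases.
    category = np.full(given.shape, "ambiguous", dtype=object)
    target = given.copy()

    # Assumption: a label is "clean" when the model agrees with it.
    clean = pred == given
    # Assumption: a disagreeing label is flipped only at very high confidence.
    relabel = (~clean) & (conf >= relabel_thr)

    category[clean] = "clean"
    category[relabel] = "re-labeled"
    target[relabel] = pred[relabel]
    return category, target

# Hypothetical toy data: 2 instances, 3 classes.
given = np.array([[1, 0, 1], [0, 1, 0]])
probs = np.array([[0.9, 0.1, 0.02], [0.6, 0.8, 0.3]])
category, target = refine_labels(given, probs)
print(category)
print(target)
```

In this toy example, the third label of the first instance is flipped because the model confidently disagrees with it, while the first label of the second instance stays ambiguous and keeps its given value. A training loop could, for instance, down-weight the ambiguous entries in the loss; that choice is likewise an assumption and is not taken from the source.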

Hwanjun Song, Minseok Kim, Jae-Gil Lee • 2023

Related benchmarks

Task	Dataset	Metric	Result	
Multi-Label Classification	PASCAL VOC 2007 (test)	mAP	91.6	125
Multi-label Scene Classification	UCMerced	mAP (macro)	90.41	105
Multi-label Scene Classification	AID-ML (test)	mAP (macro)	69.67	105
Multilabel Classification	mediamill (test)	Macro F1 Score	54.5	39
Multi-Label Classification	UCMerced	mAP (macro)	90.28	35
Multi-Label Classification	COCO 2014 (test)	mAP	66.5	31
Multi-Label Classification	Yeast (test)	Micro-F1	78.5	15
Multi-label Scene Classification	DeepGlobe-ML Subtractive Noise (test)	mAP macro (10% noise)	75.15	7
Multi-label Scene Classification	DeepGlobe-ML Additive Noise (test)	mAP macro (10% noise)	75.51	7
Multi-label Scene Classification	DeepGlobe-ML Mixed Noise (test)	mAP macro (10% noise)	72.53	7
