RepAugment: Input-Agnostic Representation-Level Augmentation for Respiratory Sound Classification

About

Recent advancements in AI have democratized its deployment as a healthcare assistant. While pretrained models from large-scale visual and audio datasets have demonstrably generalized to this task, surprisingly, no studies have explored pretrained speech models, which, as human-originated sounds, intuitively would share closer resemblance to lung sounds. This paper explores the efficacy of pretrained speech models for respiratory sound classification. We find that there is a characterization gap between speech and lung sound samples, and to bridge this gap, data augmentation is essential. However, the most widely used augmentation technique for audio and speech, SpecAugment, requires 2-dimensional spectrogram format and cannot be applied to models pretrained on speech waveforms. To address this, we propose RepAugment, an input-agnostic representation-level augmentation technique that outperforms SpecAugment, but is also suitable for respiratory sound classification with waveform pretrained models. Experimental results show that our approach outperforms the SpecAugment, demonstrating a substantial improvement in the accuracy of minority disease classes, reaching up to 7.14%.

June-Woo Kim, Miika Toikkanen, Sangmin Bae, Minseok Kim, Ho-Young Jung• 2024

Related benchmarks

Task	Dataset	Result
Respiratory sound classification	ICBHI dataset official (60-40% split)	Specificity82.47	54
Respiratory sound classification	ICBHI 2017 (official)	Specificity82.47	32
Respiratory sound classification	AKGC417L (IND)	Overall Score80.54	17
Respiratory sound classification	Yunting (IND)	Specificity87.3	11
Respiratory sound classification	Yunting (OOD)	Specificity84.77	11
Respiratory sound classification	LittC2SE (OOD)	Specificity95.24	11
Respiratory sound classification	Litt3200 OOD	Specificity16.72	11
Respiratory sound classification	AKGC417L OOD	Specificity98.4	11
Respiratory sound classification	Meditron OOD	Specificity (Sp)85.25	11
Respiratory sound classification	Setting #2 IND	Specificity87.16	11

Showing 10 of 12 rows

Other info

Follow for update

@wizwand_team Discord