RepAugment: Input-Agnostic Representation-Level Augmentation for Respiratory Sound Classification
About
Recent advancements in AI have democratized its deployment as a healthcare assistant. While models pretrained on large-scale visual and audio datasets have demonstrably generalized to this task, surprisingly, no studies have explored pretrained speech models, which, being trained on human-originated sounds, would intuitively bear a closer resemblance to lung sounds. This paper explores the efficacy of pretrained speech models for respiratory sound classification. We find that there is a characterization gap between speech and lung sound samples, and that data augmentation is essential to bridge this gap. However, the most widely used augmentation technique for audio and speech, SpecAugment, requires a 2-dimensional spectrogram format and cannot be applied to models pretrained on speech waveforms. To address this, we propose RepAugment, an input-agnostic representation-level augmentation technique that not only outperforms SpecAugment but is also suitable for respiratory sound classification with waveform-pretrained models. Experimental results show that our approach outperforms SpecAugment, with a substantial improvement in the accuracy of minority disease classes of up to 7.14%.
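To make "input-agnostic" concrete, here is a minimal sketch of an augmentation applied to an encoder's output embedding rather than to its input. The abstract does not spell out which operations RepAugment applies, so `mask_representation`, its `mask_rate` parameter, and the choice of random feature masking are illustrative assumptions, not the authors' implementation. The point is only that such a function never sees the input, so it works the same whether the encoder consumed a waveform or a spectrogram.

```python
import random
from typing import List, Optional


def mask_representation(
    rep: List[float],
    mask_rate: float = 0.2,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Zero out a random subset of feature dimensions of one embedding.

    Hypothetical helper: `rep` stands for a pooled encoder output
    (one vector per audio clip). The input format of the encoder is
    irrelevant here, which is what makes the augmentation input-agnostic.
    """
    if not 0.0 <= mask_rate <= 1.0:
        raise ValueError("mask_rate must be in [0, 1]")
    rng = rng or random.Random()
    # Number of feature dimensions to mask (an assumed design choice).
    n_mask = int(round(mask_rate * len(rep)))
    masked_idx = set(rng.sample(range(len(rep)), n_mask))
    # Return a new list so the original representation is left untouched.
    return [0.0 if i in masked_idx else x for i, x in enumerate(rep)]


# Usage: augment a toy 10-dimensional embedding during training.
rng = random.Random(0)
embedding = [0.5, -1.2, 0.3, 0.8, -0.1, 1.1, 0.2, -0.7, 0.9, 0.4]
augmented = mask_representation(embedding, mask_rate=0.3, rng=rng)
```

In a training loop, a function like this would sit between the pretrained encoder and the classification head and would be disabled at evaluation time.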
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Respiratory sound classification | ICBHI dataset official (60-40% split) | Specificity | 82.47 | 54 |
| Respiratory sound classification | ICBHI 2017 (official) | Specificity | 82.47 | 32 |
| Respiratory sound classification | AKGC417L (IND) | Overall Score | 80.54 | 17 |
| Respiratory sound classification | Yunting (IND) | Specificity | 87.3 | 11 |
| Respiratory sound classification | Yunting (OOD) | Specificity | 84.77 | 11 |
| Respiratory sound classification | LittC2SE (OOD) | Specificity | 95.24 | 11 |
| Respiratory sound classification | Litt3200 OOD | Specificity | 16.72 | 11 |
| Respiratory sound classification | AKGC417L OOD | Specificity | 98.4 | 11 |
| Respiratory sound classification | Meditron OOD | Specificity (Sp) | 85.25 | 11 |
| Respiratory sound classification | Setting #2 IND | Specificity | 87.16 | 11 |