SUBARU: A Practical Approach to Power Saving in Hearables Using SUB-Nyquist Audio Resolution Upsampling
About
Hearables are wearable computers that are worn on the ear. Bone conduction microphones (BCMs) are used alongside air conduction microphones (ACMs) in hearables as a supporting modality for multimodal speech enhancement (SE) in noisy conditions. However, existing work does not consider the following practical aspects of low-power implementations on hearables: (i) it does not explore how lowering the sampling frequency and bit resolution of the analog-to-digital converters (ADCs) in hearables jointly affects low-power processing and multimodal SE in terms of speech quality and intelligibility; and (ii) it does not process signals from ACMs/BCMs at a sub-Nyquist sampling rate, because the existing frameworks lack a method for reconstructing wideband signals from their narrowband parts. We propose SUBARU (Sub-Nyquist Audio Resolution Upsampling), which (i) intentionally uses sub-Nyquist sampling and low bit resolution in ADCs, achieving a 3.31x reduction in power consumption; and (ii) supports streaming operation on mobile platforms and SE in in-the-wild noisy conditions, with an inference time of 1.74 ms and a memory footprint of less than 13.77 MB.
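To make the idea of a low-rate, low-resolution capture concrete, the following is a minimal sketch of how such an input could be simulated in software. It is not the authors' pipeline: the function name `simulate_low_power_adc`, the 16 kHz source rate, the decimation factor of 4, and the 12-bit depth are illustrative assumptions, and decimating an already-digitized signal only approximates what a real ADC running at a lower rate would produce.

```python
import numpy as np

def simulate_low_power_adc(x, factor=4, bits=12):
    """Hypothetical helper: approximate a low-rate, low-bit capture.

    x      : float array with samples in [-1, 1]
    factor : keep every `factor`-th sample (no anti-alias filter, so
             content above the new Nyquist frequency aliases)
    bits   : signed quantizer resolution
    """
    x = np.asarray(x, dtype=np.float64)
    decimated = x[::factor]                   # naive sub-sampling
    levels = 2 ** (bits - 1)                  # e.g. 2048 for 12 bits
    q = np.round(decimated * levels)          # map to integer codes
    q = np.clip(q, -levels, levels - 1)       # stay inside the code range
    return q / levels                         # back to floats in [-1, 1)

# Assumed example input: one second of a 440 Hz tone at 16 kHz.
fs = 16000
t = np.arange(fs) / fs
x = 0.5 * np.sin(2 * np.pi * 440 * t)
y = simulate_low_power_adc(x, factor=4, bits=12)  # 4 kHz, 12-bit
```

A wideband reconstruction model would then take a signal like `y` as input and try to recover the original-rate waveform.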
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Speech Enhancement | VCTK Vibration sensor 12-bit, 4-16 kHz upsampling (test) | LSD (Log-Spectral Distance) 0.84 | 18 |
| Speech Enhancement | VCTK Accelerometer 12-bit, 4-16 kHz upsampling (test) | LSD 0.87 | 18 |
| Bandwidth Extension (4-22 kHz upsampling) | MagnaTagATune (test) | LSD 0.86 | 15 |
| Bandwidth Extension (BWE) | VCTK Desktop | LSD 0.82 | 10 |
| Bandwidth Extension (BWE) | VCTK Google Pixel7 | LSD 0.84 | 10 |
| Bandwidth Extension (4-22 kHz upsampling) | VCTK (test) | LSD 0.86 | 7 |
| Bandwidth Extension (8-22 kHz upsampling) | VCTK (test) | LSD 0.78 | 7 |
| Bandwidth Extension (8-22 kHz upsampling) | MagnaTagATune (test) | LSD 0.8 | 6 |
| Speech Recognition | Vibration Sensor Audio Mobile Scenarios, Inside lab | CER 7 | 4 |
| Bandwidth Extension | Noisy vibration sensor data | LSD 0.85 | 2 |
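
Most results above are reported as LSD (log-spectral distance), where lower is better. As a rough sketch of how this metric is commonly computed in the bandwidth-extension literature, the snippet below averages, over STFT frames, the root-mean-square difference of log power spectra. The FFT size, hop length, Hann window, and `eps` floor are assumptions chosen for illustration; the exact settings behind the listed numbers are not stated here, so values from this sketch are not directly comparable to the table.

```python
import numpy as np

def stft_power(x, n_fft=2048, hop=512):
    """Power spectrogram (frames x bins) using a Hann window.

    Assumes len(x) >= n_fft; trailing samples that do not fill a
    whole frame are dropped.
    """
    x = np.asarray(x, dtype=np.float64)
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def log_spectral_distance(ref, est, eps=1e-10):
    """Mean over frames of the RMS log10-power difference over bins."""
    p_ref = stft_power(ref)
    p_est = stft_power(est)
    diff = np.log10(p_ref + eps) - np.log10(p_est + eps)
    return float(np.mean(np.sqrt(np.mean(diff ** 2, axis=1))))

# Assumed example: compare a noise signal with a half-amplitude copy.
rng = np.random.default_rng(0)
ref = rng.standard_normal(16000)
est = 0.5 * ref
lsd = log_spectral_distance(ref, est)
```

In this example every bin of `est` has one quarter of the reference power, so the distance comes out close to log10(4), and comparing a signal with itself gives 0.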