Learning Neural Acoustic Fields
About
Our environment is filled with rich and dynamic acoustic information. When we walk into a cathedral, the reverberations, as much as the appearance, inform us of the sanctuary's wide open space. Similarly, as an object moves around us, we expect the sound it emits to reflect this movement. While recent advances in learned implicit functions have led to increasingly high-quality representations of the visual world, there have not been commensurate advances in learning spatial auditory representations. To address this gap, we introduce Neural Acoustic Fields (NAFs), an implicit representation that captures how sounds propagate in a physical scene. By modeling acoustic propagation in a scene as a linear time-invariant system, NAFs learn to continuously map all emitter and listener location pairs to a neural impulse response function that can then be applied to arbitrary sounds. We demonstrate that the continuous nature of NAFs lets us render spatial acoustics for a listener at an arbitrary location and predict sound propagation at novel locations. We further show that the representation learned by NAFs can help improve visual learning with sparse views. Finally, we show that a representation informative of scene structure emerges during the learning of NAFs.
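To make the linear time-invariant assumption concrete: in an LTI system, the sound heard at a listener is the emitted sound convolved with the impulse response for that emitter and listener pair. The sketch below illustrates only that final step, using NumPy. The function name `render_at_listener` and the hand-written `toy_ir` are hypothetical stand-ins; in the paper's setting the impulse response would come from the learned field, which is not reproduced here.

```python
import numpy as np

def render_at_listener(dry_sound, impulse_response):
    """Apply an impulse response to a dry signal by linear convolution.

    Hypothetical helper: it stands in for the step where a predicted
    impulse response is applied to an arbitrary sound.
    """
    dry_sound = np.asarray(dry_sound, dtype=np.float64)
    impulse_response = np.asarray(impulse_response, dtype=np.float64)
    # Full convolution has length len(dry_sound) + len(impulse_response) - 1.
    return np.convolve(dry_sound, impulse_response, mode="full")

# Toy impulse response (an assumption, not model output):
# a direct path, then one echo 3 samples later at half amplitude.
toy_ir = np.array([1.0, 0.0, 0.0, 0.5])

# A single click as the emitted sound.
click = np.array([1.0, 0.0, 0.0, 0.0])
wet = render_at_listener(click, toy_ir)

# Linearity check: rendering a weighted sum of two sounds equals
# the same weighted sum of the individually rendered sounds.
rng = np.random.default_rng(0)
a = rng.standard_normal(16)
b = rng.standard_normal(16)
mixed = render_at_listener(2.0 * a + 3.0 * b, toy_ir)
separate = 2.0 * render_at_listener(a, toy_ir) + 3.0 * render_at_listener(b, toy_ir)
```

Because the impulse response is independent of the emitted signal, one response per emitter and listener pair is enough to render any sound at that pair of locations.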
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| 3D Acoustic Field Modeling | RAF 48 kHz (test) | STFT Error (dB) | 0.64 | 13 |
| Room Impulse Response Synthesis | RIR Synthesis Dataset 16 kHz sampling rate 1.0 (test) | STFT Error (dB) | 0.77 | 13 |
| Novel View Acoustic Synthesis | RWAVS Office | MAG | 1.244 | 8 |
| Novel View Acoustic Synthesis | RWAVS House | MAG | 3.259 | 8 |
| Novel View Acoustic Synthesis | RWAVS Apartment | MAG Score | 3.345 | 8 |
| Novel View Acoustic Synthesis | RWAVS Outdoors | MAG | 1.284 | 8 |
| Novel View Acoustic Synthesis | RWAVS Overall | MAG | 2.283 | 8 |
| Acoustic Impulse Response Prediction | SoundSpaces Large 1 (test) | Spectral Loss | 0.396 | 7 |
| Acoustic Impulse Response Prediction | SoundSpaces Large 2 (test) | Spectral Loss | 0.413 | 7 |
| Acoustic Impulse Response Prediction | SoundSpaces Medium 1 (test) | Spectral Loss | 0.384 | 7 |