AV-NeRF: Learning Neural Fields for Real-World Audio-Visual Scene Synthesis
About
Can machines recording an audio-visual scene produce realistic, matching audio-visual experiences at novel positions and novel view directions? We answer this question by studying a new task -- real-world audio-visual scene synthesis -- and a first-of-its-kind NeRF-based approach for multimodal learning. Concretely, given a video recording of an audio-visual scene, the task is to synthesize new videos with spatial audio along arbitrary novel camera trajectories in that scene. We propose an acoustic-aware audio generation module that integrates prior knowledge of audio propagation into NeRF, implicitly associating audio generation with the 3D geometry and material properties of the visual environment. Furthermore, we present a coordinate transformation module that expresses a view direction relative to the sound source, enabling the model to learn sound source-centric acoustic fields. To facilitate the study of this new task, we collect a high-quality Real-World Audio-Visual Scene (RWAVS) dataset. We demonstrate the advantages of our method on this real-world dataset and on the simulation-based SoundSpaces dataset.
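The idea behind the coordinate transformation module, expressing the view direction relative to the sound source rather than in absolute world coordinates, can be illustrated with a small sketch. This is not the authors' implementation: the function name `relative_view_angle`, the restriction to the 2D horizontal plane, and the choice of a signed angle wrapped to [-pi, pi) are all assumptions made for illustration only.

```python
import math


def relative_view_angle(listener_xy, view_dir_xy, source_xy):
    """Signed angle (radians) from the listener-to-source direction
    to the listener's view direction, wrapped to [-pi, pi).

    Hypothetical helper for illustration; works in the horizontal plane only.
    """
    # Direction from the listener to the sound source.
    to_src_x = source_xy[0] - listener_xy[0]
    to_src_y = source_xy[1] - listener_xy[1]
    source_angle = math.atan2(to_src_y, to_src_x)

    # Absolute heading of the view direction in world coordinates.
    view_angle = math.atan2(view_dir_xy[1], view_dir_xy[0])

    # Express the view direction relative to the source direction,
    # then wrap the difference into [-pi, pi).
    diff = view_angle - source_angle
    return (diff + math.pi) % (2 * math.pi) - math.pi


# Facing the source directly gives a relative angle of zero,
# regardless of where the pair sits in the world frame.
facing = relative_view_angle((0.0, 0.0), (1.0, 0.0), (2.0, 0.0))
shifted = relative_view_angle((5.0, 5.0), (0.0, 1.0), (5.0, 8.0))
```

Because the result depends only on the geometry between listener and source, two poses that differ by a translation or rotation of the whole scene map to the same input, which is one plausible reading of why a source-centric representation is easier to learn.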
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Novel-view Sound Synthesis | Soundspace-Ambient (Seen Scenes) | STFT | 9.424 | 15 |
| Novel-view Sound Synthesis | Soundspace-Ambient (Unseen Scenes) | STFT | 9.321 | 15 |
| Room Impulse Response Synthesis | RIR Synthesis Dataset 16 kHz sampling rate 1.0 (test) | STFT Error (dB) | 0.46 | 13 |
| 3D Acoustic Field Modeling | RAF 48 kHz (test) | STFT Error (dB) | 0.39 | 13 |
| Binaural audio synthesis | N2S (test) | STFT | 2.194 | 9 |
| Novel-view Sound Synthesis | N2S Benchmark real-world scene | STFT Error | 2.194 | 9 |
| Novel View Acoustic Synthesis | RWAVS Office | MAG | 0.93 | 8 |
| Novel View Acoustic Synthesis | RWAVS House | MAG | 2.009 | 8 |
| Novel View Acoustic Synthesis | RWAVS Apartment | MAG Score | 2.23 | 8 |
| Novel View Acoustic Synthesis | RWAVS Outdoors | MAG | 0.845 | 8 |