Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark

About

We present a new dataset called Real Acoustic Fields (RAF) that captures real acoustic room data from multiple modalities. The dataset includes high-quality and densely captured room impulse response data paired with multi-view images, and precise 6DoF pose tracking data for sound emitters and listeners in the rooms. We used this dataset to evaluate existing methods for novel-view acoustic synthesis and impulse response generation, which previously relied on synthetic data. In our evaluation, we thoroughly assessed existing audio and audio-visual models against multiple criteria and proposed settings to enhance their performance on real-world data. We also conducted experiments to investigate the impact of incorporating visual data (i.e., images and depth) into neural acoustic field models. Additionally, we demonstrated the effectiveness of a simple sim2real approach, where a model is pre-trained with simulated data and fine-tuned with sparse real-world data, resulting in significant improvements in few-shot learning. RAF is the first dataset to provide densely captured room acoustic data, making it an ideal resource for researchers working on audio and audio-visual neural acoustic field modeling techniques. Demos and datasets are available on our project page: https://facebookresearch.github.io/real-acoustic-fields/

Ziyang Chen, Israel D. Gebru, Christian Richardt, Anurag Kumar, William Laney, Andrew Owens, Alexander Richard • 2024

Related benchmarks

Task | Dataset | Result | Rank
3D Acoustic Field Modeling | RAF 48 kHz (test) | STFT Error (dB): 0.39 | 13
Room Impulse Response Synthesis | RIR Synthesis Dataset 16 kHz sampling rate 1.0 (test) | -- | 13