
Few-Shot Audio-Visual Learning of Environment Acoustics

About

Room impulse response (RIR) functions capture how the surrounding physical environment transforms the sounds heard by a listener, with implications for various applications in AR, VR, and robotics. Whereas traditional methods to estimate RIRs assume dense geometry and/or sound measurements throughout the environment, we explore how to infer RIRs based on a sparse set of images and echoes observed in the space. Towards that goal, we introduce a transformer-based method that uses self-attention to build a rich acoustic context, then predicts RIRs of arbitrary query source-receiver locations through cross-attention. Additionally, we design a novel training objective that improves the match in the acoustic signature between the RIR predictions and the targets. In experiments using a state-of-the-art audio-visual simulator for 3D environments, we demonstrate that our method successfully generates arbitrary RIRs, outperforming state-of-the-art methods and -- in a major departure from traditional methods -- generalizing to novel environments in a few-shot manner. Project: http://vision.cs.utexas.edu/projects/fs_rir.
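The two attention stages described above can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation. The embedding size, the number of observations and queries, and the helper names `softmax` and `attend` are all hypothetical, and random vectors stand in for learned embeddings. A real model would also add learned projections, multiple heads, feed-forward layers, and a decoder that maps the query features to an RIR.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Single-head scaled dot-product attention; returns one row per query."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim = len(values[0])
    out = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(dim)])
    return out


random.seed(0)
DIM = 8      # hypothetical embedding size
N_OBS = 4    # hypothetical number of sparse observations (image + echo)
N_QUERY = 2  # hypothetical number of query source-receiver pairs

# Random stand-ins for learned observation and query-pose embeddings.
observations = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N_OBS)]
queries = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N_QUERY)]

# Self-attention: each observation attends to all observations,
# giving a shared acoustic context.
context = attend(observations, observations, observations)

# Cross-attention: each query attends to the context to collect the
# information relevant to its source-receiver pair.
query_features = attend(queries, context, context)
```

Because the queries only read from the context, any number of source-receiver pairs can be queried against the same small set of observations, which is the property the few-shot setting relies on.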

Sagnik Majumder, Changan Chen, Ziad Al-Halah, Kristen Grauman • 2022

Related benchmarks

Task | Dataset | Metric | Result | Rank
Novel-view Sound Synthesis | Soundspace-Ambient (Unseen Scenes) | STFT | 5.457 | 15
Novel-view Sound Synthesis | Soundspace-Ambient (Seen Scenes) | STFT | 5.937 | 15
Room Impulse Response (RIR) Prediction | Matterport3D (Seen environments) | STFT | 1.1 | 9
Room Impulse Response (RIR) Prediction | Matterport3D (Unseen environments) | STFT | 1.22 | 9
Binaural audio synthesis | N2S (test) | STFT | 1.765 | 9
Novel-view Sound Synthesis | N2S Benchmark real-world scene | STFT Error | 1.765 | 9
Depth Estimation | environments (unseen) | DPE | 1.45 | 7
Sound Source Localization | environments (unseen) | SLE | 64.6 | 7
Sound Source Localization | Environments (seen) | SLE | 50.3 | 6
Depth Estimation | Environments (seen) | DPE | 135 | 6
