NDF+: Joint Neural Directional Filtering and Diffuse Sound Extraction
About
Recently, neural directional filtering (NDF) has been introduced as a flexible approach for reconstructing a virtual directional microphone (VDM) with a desired directivity pattern for spatial sound capture. Building on this idea, we propose NDF+, which enables joint neural directional filtering and diffuse sound extraction. NDF+ reformulates VDM estimation into two coupled subtasks: dereverberated VDM reconstruction and diffuse sound extraction. This reformulation enables NDF+ to manipulate diffuse components in the final reconstructed VDM output. We evaluated NDF+ under reverberant conditions and compared it with representative conventional baselines. Results show that NDF+ consistently outperforms the baselines on both subtasks, while maintaining VDM reconstruction quality comparable to that of the original single-task NDF model. These findings indicate that NDF+ introduces an additional degree of freedom for diffuse sound control in the VDM reconstruction. In a stereo recording application, NDF+ provides controllable inter-channel level differences between left and right channels by adjusting the estimated diffuse component.
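The decomposition described above can be illustrated with a minimal sketch. It assumes the two estimated components are simply summed, with a scalar gain applied to the diffuse part. The names `remix_vdm`, `level_difference_db`, and `diffuse_gain_db` are hypothetical, the signals are synthetic stand-ins for network outputs, and the choice of a single diffuse estimate shared by both channels is an illustrative assumption rather than the paper's configuration.

```python
import numpy as np


def remix_vdm(direct_vdm, diffuse, diffuse_gain_db=0.0):
    """Sum a dereverberated VDM estimate and a gain-scaled diffuse estimate."""
    gain = 10.0 ** (diffuse_gain_db / 20.0)  # dB -> linear amplitude
    return np.asarray(direct_vdm, dtype=float) + gain * np.asarray(diffuse, dtype=float)


def level_difference_db(left, right, eps=1e-12):
    """Broadband inter-channel level difference in dB (left relative to right)."""
    p_left = np.mean(np.square(left))
    p_right = np.mean(np.square(right))
    return 10.0 * np.log10((p_left + eps) / (p_right + eps))


# Synthetic stand-ins for the model outputs (assumption, not real data):
# a source seen with different directional gains by a left and a right VDM,
# plus one diffuse estimate shared by both channels.
rng = np.random.default_rng(0)
n_samples = 16000
source = rng.standard_normal(n_samples)
direct_left = 1.0 * source
direct_right = 0.25 * source
diffuse = 0.5 * rng.standard_normal(n_samples)

for gain_db in (-60.0, 0.0, 12.0):
    left = remix_vdm(direct_left, diffuse, gain_db)
    right = remix_vdm(direct_right, diffuse, gain_db)
    print(f"diffuse gain {gain_db:+.0f} dB -> level difference "
          f"{level_difference_db(left, right):.2f} dB")
```

In this toy setup, raising the diffuse gain adds the same energy to both channels and so shrinks the level difference between them, while attenuating it leaves the difference set by the directional components alone.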
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| VDM reconstruction | EARS RT60 = 0.2 s (test) | SDR | 21.42 | 13 |
| VDM reconstruction | EARS RT60 = 0.4 s (test) | SDR | 17.98 | 13 |
| VDM reconstruction | EARS RT60 = 0.6 s (test) | SDR | 16.44 | 13 |
| Diffuse sound extraction | EARS RT60 = 0.2 s (test) | SDR | 3.99 | 5 |
| Diffuse sound extraction | EARS RT60 = 0.4 s (test) | SDR | 7.26 | 5 |
| Diffuse sound extraction | EARS RT60 = 0.6 s (test) | SDR | 8.22 | 5 |
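The results above are reported as SDR (signal-to-distortion ratio). This page does not say which SDR variant was used, so the sketch below assumes the plain energy-ratio definition, 10·log10(‖s‖² / ‖s − ŝ‖²); the function name `sdr_db` is hypothetical, and the paper's evaluation may differ (for example, a scale-invariant variant).

```python
import numpy as np


def sdr_db(reference, estimate, eps=1e-12):
    """Plain SDR in dB: reference energy over the energy of the estimation error."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    signal_energy = np.sum(reference ** 2)
    error_energy = np.sum((reference - estimate) ** 2)
    # eps guards against division by zero for a perfect estimate
    return 10.0 * np.log10((signal_energy + eps) / (error_energy + eps))


# Toy check: an estimate that is uniformly 10% too quiet.
reference = np.ones(100)
estimate = 0.9 * reference
print(f"SDR: {sdr_db(reference, estimate):.2f} dB")
```

Under this definition, a higher value means the estimate is closer to the reference signal.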