Physics-Guided Variational Model for Unsupervised Sound Source Tracking

About

Sound source tracking is commonly performed using classical array-processing algorithms, while machine-learning approaches typically rely on precise source position labels that are expensive or impractical to obtain. This paper introduces a physics-guided variational model capable of fully unsupervised single-source sound source tracking. The method combines a variational encoder with a physics-based decoder that injects geometric constraints into the latent space through analytically derived pairwise time-delay likelihoods. Without requiring ground-truth labels, the model learns to estimate source directions directly from microphone array signals. Experiments on real-world data demonstrate that the proposed approach outperforms traditional baselines and achieves accuracy and computational complexity comparable to state-of-the-art supervised models. We further show that the method generalizes well to mismatched array geometries and exhibits strong robustness to corrupted microphone position metadata. Finally, we outline a natural extension of the approach to multi-source tracking and present the theoretical modifications required to support it.
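The geometric constraint behind pairwise time delays can be illustrated with a minimal sketch. It is not the paper's decoder or its likelihood, only the standard far-field (plane-wave) relation between a source direction and the time difference of arrival at each microphone pair. The function names, the 343 m/s speed of sound, and the example array coordinates are all assumptions made for illustration.

```python
import math
from itertools import combinations

SPEED_OF_SOUND = 343.0  # m/s; assumed value, roughly air at 20 degrees Celsius


def unit_vector(azimuth_rad, elevation_rad=0.0):
    """Unit vector pointing from the array origin toward the source."""
    ce = math.cos(elevation_rad)
    return (
        ce * math.cos(azimuth_rad),
        ce * math.sin(azimuth_rad),
        math.sin(elevation_rad),
    )


def pairwise_tdoa(mic_positions, azimuth_rad, elevation_rad=0.0, c=SPEED_OF_SOUND):
    """Far-field delay tau_ij = t_i - t_j (seconds) for every pair i < j.

    Under a plane-wave assumption the arrival time at microphone i is
    t_i = t_0 - (p_i . u) / c, so tau_ij = -((p_i - p_j) . u) / c.
    """
    u = unit_vector(azimuth_rad, elevation_rad)
    delays = {}
    for i, j in combinations(range(len(mic_positions)), 2):
        diff = [a - b for a, b in zip(mic_positions[i], mic_positions[j])]
        # A microphone further along u is closer to the source and hears it earlier.
        delays[(i, j)] = -sum(d * k for d, k in zip(diff, u)) / c
    return delays


# Hypothetical two-microphone array with 10 cm spacing along the x axis.
mics = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0)]
delays = pairwise_tdoa(mics, azimuth_rad=0.0)
```

With the source on the positive x axis, microphone 1 is closer to it, so `delays[(0, 1)]` is positive and equals 0.1 / 343 s, about 0.29 ms. A physics-based decoder could compare such predicted delays against delays measured from the signals, but the exact likelihood used in the paper is not given on this page.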

Luan Vinícius Fiorio, Ivana Nikoloska, Bruno Defraene, Alex Young, Johan David, Ronald M. Aarts • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Direction of Arrival Estimation | Simulated data (Experiment 3) | RMSAE | 6.3 | 4 |
| Direction of Arrival Estimation | LOCATA Experiment 3 | RMS Angular Error | 8.2 | 4 |
| Direction of Arrival Estimation | Simulated data Experiment 1 | RMSAE | 5.5 | 4 |
| Direction of Arrival Estimation | LOCATA Experiment 1 | RMSAE | 8.8 | 4 |
| DOA estimation | Simulated data Experiment 2 - directional AWGN | RMS Angular Error | 3.8 | 4 |
| DOA estimation | LOCATA Experiment 2 - directional AWGN | RMSAE | 8 | 4 |
| Direction of Arrival Estimation | LOCATA (test) | Params (M) | 0.89 | 3 |
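The benchmark rows report a root-mean-square angular error (RMSAE) without defining it on this page. The sketch below shows one common way such a metric is computed for azimuth-only estimates. It assumes angles in degrees and wraps each error to the shortest arc. The function names are hypothetical, and the paper's exact definition may differ.

```python
import math


def angular_difference_deg(estimate_deg, reference_deg):
    """Smallest signed difference between two angles, in [-180, 180) degrees."""
    return (estimate_deg - reference_deg + 180.0) % 360.0 - 180.0


def rms_angular_error_deg(estimates_deg, references_deg):
    """Root-mean-square of the wrapped angular errors, in degrees."""
    if len(estimates_deg) != len(references_deg) or not estimates_deg:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [
        angular_difference_deg(e, r) ** 2
        for e, r in zip(estimates_deg, references_deg)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical estimates and references; 359 vs 1 degree is a 2 degree error, not 358.
rmsae = rms_angular_error_deg([359.0, 10.0], [1.0, 14.0])
```

Wrapping matters near the 0/360 degree boundary. Without it, an estimate of 359 degrees against a reference of 1 degree would count as a 358 degree error and dominate the average.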