O-EENC-SD: Efficient Online End-to-End Neural Clustering for Speaker Diarization

About

We introduce O-EENC-SD, an end-to-end online speaker diarization system based on EEND-EDA that features a novel RNN-based stitching mechanism for online prediction. In particular, we develop a novel centroid refinement decoder whose usefulness is assessed through a rigorous ablation study. The system offers two key advantages over existing methods: unlike unsupervised clustering approaches, it is hyperparameter-free, and it is more efficient than current online end-to-end methods, which are computationally costly. We demonstrate that O-EENC-SD is competitive with the state of the art in the two-speaker conversational telephone speech domain, as tested on the CallHome dataset. Our results show that O-EENC-SD offers a favorable trade-off between diarization error rate (DER) and computational complexity, even when it processes independent chunks with no overlap, which makes the system highly efficient.

Elio Gruttadauria, Mathieu Fontaine, Jonathan Le Roux, Slim Essid • 2025

Related benchmarks

Task | Dataset | Result | Rank
Speaker Diarization | CallHome 2-spks with 0.25s collar tolerance (test) | DER (%): 9.33 | 21
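
For readers unfamiliar with the metric reported above, DER is conventionally the sum of missed speech, false alarm, and speaker confusion time divided by the total reference speech time; with a collar, a short window around each reference speaker boundary (0.25 s here) is excluded from scoring. The following is a minimal sketch of that arithmetic only, not the evaluation code used for this benchmark. The function name and the example durations are hypothetical and chosen purely for illustration.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_speech):
    """Return DER in percent from error durations given in seconds.

    All arguments are assumed to be measured after any collar regions
    have already been excluded from scoring.
    """
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return 100.0 * (missed + false_alarm + confusion) / total_speech


# Hypothetical durations in seconds, for illustration only
# (these are not results from the paper).
der = diarization_error_rate(
    missed=3.0, false_alarm=2.0, confusion=5.0, total_speech=100.0
)
print(f"DER = {der:.2f}%")
```

Because false alarms count in the numerator but not in the denominator, DER can in principle exceed 100%.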
