
Pushing the Limits of End-to-End Diarization

About

In this paper, we present state-of-the-art diarization error rates (DERs) on multiple publicly available datasets, including AliMeeting-far, AliMeeting-near, AMI-Mix, AMI-SDM, DIHARD III, and MagicData RAMC. Leveraging EEND-TA, a single unified non-autoregressive model for end-to-end speaker diarization, we achieve new benchmark results, most notably a DER of 14.49% on DIHARD III. Our approach scales pretraining through 8-speaker simulation mixtures, ensuring each generated speaker mixture configuration is sufficiently represented. These experiments highlight that EEND-based architectures possess a greater capacity for learning than previously explored, surpassing many existing diarization solutions while maintaining efficient speeds during inference.

Samuel J. Broughton, Lahiru Samarakoon • 2025

Related benchmarks

Task | Dataset | Result | Rank
Speaker Diarization | AliMeeting (test) | DER 0.1141 | 13
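
The benchmark entry above gives DER as a fraction (0.1141), while the abstract quotes DER as a percentage (14.49% on DIHARD III). As a minimal sketch of how the two relate, the snippet below uses the standard definition of DER: missed speech, false alarm, and speaker confusion time divided by total reference speech time. The function names and the example durations are hypothetical, and the paper's actual scoring setup (for example collar size or overlap handling) is not stated on this page.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_speech):
    """Return DER as a fraction of total reference speech time.

    All arguments are durations in the same unit (e.g. seconds).
    This is the standard textbook definition, not necessarily the
    exact scoring configuration used by the authors.
    """
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (missed + false_alarm + confusion) / total_speech


def as_percent(der_fraction):
    """Convert a fractional DER (e.g. 0.1141) to a percentage (e.g. 11.41)."""
    return 100.0 * der_fraction


# Hypothetical durations in seconds, chosen only for illustration.
example = diarization_error_rate(
    missed=30.0, false_alarm=20.0, confusion=10.0, total_speech=600.0
)
```

Under this convention, the listed AliMeeting (test) result of 0.1141 corresponds to a DER of 11.41%.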