SBAAM! Eliminating Transcript Dependency in Automatic Subtitling

About

Subtitling plays a crucial role in enhancing the accessibility of audiovisual content and encompasses three primary subtasks: translating spoken dialogue, segmenting translations into concise textual units, and estimating timestamps that govern their on-screen duration. Past attempts to automate this process rely, to varying degrees, on automatic transcripts, employed diversely for the three subtasks. In response to the acknowledged limitations associated with this reliance on transcripts, recent research has shifted towards transcription-free solutions for translation and segmentation, leaving the direct generation of timestamps as uncharted territory. To fill this gap, we introduce the first direct model capable of producing automatic subtitles, entirely eliminating any dependence on intermediate transcripts also for timestamp prediction. Experimental results, backed by manual evaluation, showcase our solution's new state-of-the-art performance across multiple language pairs and diverse conditions.
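To make the three subtasks concrete, the sketch below shows what a finished subtitle carries: translated text, its segmentation into lines, and start/end timestamps, serialized in the widely used SRT layout. The `Subtitle` class and both helper functions are hypothetical names chosen for illustration; they are not taken from the paper or its code.

```python
from dataclasses import dataclass


@dataclass
class Subtitle:
    """One subtitle block: segmented translated text plus on-screen timing."""
    start: float      # seconds from the start of the video
    end: float        # seconds from the start of the video
    lines: list[str]  # translated text, already split into lines


def format_timestamp(seconds: float) -> str:
    """Render seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(subtitles: list[Subtitle]) -> str:
    """Serialize subtitle blocks as SRT: index, timing line, text lines."""
    blocks = []
    for index, sub in enumerate(subtitles, start=1):
        timing = f"{format_timestamp(sub.start)} --> {format_timestamp(sub.end)}"
        blocks.append("\n".join([str(index), timing, *sub.lines]))
    return "\n\n".join(blocks) + "\n"


# Illustrative input only; the text and timings are made up.
example = [Subtitle(start=0.0, end=1.5, lines=["Hallo,", "Welt."])]
print(to_srt(example))
```

A direct subtitling system in the sense described above would have to produce all three fields of each block (text, line breaks, and timing) from the audio, without first generating a source-language transcript.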

Marco Gaido, Sara Papi, Matteo Negri, Mauro Cettolo, Luisa Bentivogli • 2024

Related benchmarks

Task | Dataset | Metric | Result | Rank
--- | --- | --- | --- | ---
Automatic Subtitling | MuST-Cinema (MSTCIN) en-de (test) | SubER (cased) | 56.2 | 6
Automatic Subtitling | MuST-Cinema (MSTCIN) en-es (test) | SubER (cased) | 44.7 | 6
Automatic Subtitling | EC Short Clips (ECSC) en-de (test) | SubER (cased) | 58.5 | 6
Automatic Subtitling | Overall Average across MSTCIN, ECSC, and EPI (test) | Subtitling Error Rate (AVG) | 59.2 | 6
Automatic Subtitling | EC Short Clips (ECSC) en-es (test) | SubER (cased) | 49.9 | 6
Automatic Subtitling | EP Interviews (EPI) en-de (test) | SubER (cased) | 78.5 | 6
Automatic Subtitling | EP Interviews (EPI) en-es (test) | SubER (cased) | 70.2 | 6
Automatic Subtitling | IWSLT en-de 2023 (val) | SubER (TED uncased) | 62.1 | 2
Automatic Subtitling | IWSLT en-es 2023 (val) | SubER (TED cased) | 49.5 | 2
