Our new X account is live! Follow @wizwand_team for updates
WorkDL logo mark

Better Intermediates Improve CTC Inference

About

This paper proposes a method for improved CTC inference with searched intermediates and multi-pass conditioning. The paper first formulates self-conditioned CTC as a probabilistic model with an intermediate prediction as a latent representation and provides a tractable conditioning framework. We then propose two new conditioning methods based on the new formulation: (1) Searched intermediate conditioning that refines intermediate predictions with beam-search, (2) Multi-pass conditioning that uses predictions of previous inference for conditioning the next inference. These new approaches enable better conditioning than the original self-conditioned CTC during inference and improve the final performance. Experiments with the LibriSpeech dataset show relative 3%/12% performance improvement at the maximum in test clean/other sets compared to the original self-conditioned CTC.

Tatsuya Komatsu, Yusuke Fujita, Jaesong Lee, Lukas Lee, Shinji Watanabe, Yusuke Kida• 2022

Related benchmarks

TaskDatasetResultRank
Speech RecognitionLibrispeech (100 hours) and AISHELL v1
WER (English)7.11
5
Showing 1 of 1 rows

Other info

Follow for update