Sequence Transduction with Recurrent Neural Networks

About

Many machine learning tasks can be expressed as the transformation, or transduction, of input sequences into output sequences: speech recognition, machine translation, protein secondary structure prediction and text-to-speech, to name but a few. One of the key challenges in sequence transduction is learning to represent both the input and output sequences in a way that is invariant to sequential distortions such as shrinking, stretching and translating. Recurrent neural networks (RNNs) are a powerful sequence learning architecture that has proven capable of learning such representations. However, RNNs traditionally require a pre-defined alignment between the input and output sequences to perform transduction. This is a severe limitation, since finding the alignment is the most difficult aspect of many sequence transduction problems. Indeed, even determining the length of the output sequence is often challenging. This paper introduces an end-to-end, probabilistic sequence transduction system, based entirely on RNNs, that is in principle able to transform any input sequence into any finite, discrete output sequence. Experimental results for phoneme recognition are provided on the TIMIT speech corpus.
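The abstract does not spell out how a probabilistic system can avoid a pre-defined alignment. One common approach, sketched here as an illustration and not as the paper's exact formulation, is to sum the probability of every monotonic alignment with a forward-style dynamic program. The function name and the `blank` and `emit` tables are hypothetical; in a real system they would come from the network's outputs.

```python
def transducer_likelihood(blank, emit):
    """Sum the probability of every monotonic alignment of one output sequence.

    Assumed (hypothetical) inputs:
      blank[t][u]: probability of emitting "no label" at input step t after
                   u output labels have been produced (move on in time).
      emit[t][u]:  probability of emitting the (u+1)-th target label at
                   input step t (move on in the output).
    """
    T = len(blank)           # number of input steps
    U = len(blank[0]) - 1    # number of output labels
    # alpha[t][u]: total probability of reaching lattice node (t, u)
    alpha = [[0.0] * (U + 1) for _ in range(T)]
    alpha[0][0] = 1.0
    for t in range(T):
        for u in range(U + 1):
            if t > 0:  # arrived by consuming an input step
                alpha[t][u] += alpha[t - 1][u] * blank[t - 1][u]
            if u > 0:  # arrived by emitting an output label
                alpha[t][u] += alpha[t][u - 1] * emit[t][u - 1]
    # A final "no label" step terminates the sequence.
    return alpha[T - 1][U] * blank[T - 1][U]


# Two input steps, one output label: two alignments, each with probability 0.125.
blank = [[0.5, 0.5], [0.5, 0.5]]
emit = [[0.5], [0.5]]
print(transducer_likelihood(blank, emit))
```

Because the sum runs over all alignments, neither the alignment nor the output length has to be fixed in advance; the cost is proportional to the product of the input and output lengths.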

Alex Graves • 2012

Related benchmarks

Task | Dataset | Result | Rank
Math Reasoning | GSM8K (test) | Accuracy 82.3 | 155
Commonsense Reasoning | StrategyQA (test) | Accuracy 72.9 | 81
Automatic Speech Recognition | AISHELL-1 (test) | CER 6.96 | 71
Speech-to-speech translation | Fisher Spanish-English (test) | -- | 55
Math Reasoning | MATH 200 samples (test) | Accuracy 48 | 36
Automatic Speech Recognition | AISHELL-1 (dev) | CER 5.8 | 34
Object Hallucination | COCO 512-token budget (test) | Consistency Score 56.5 | 24
Object Hallucination | COCO 64-token budget (test) | CS 0.217 | 24
Simultaneous Speech Translation | CallHome Spanish-English Es-En (test) | BLEU 12 | 18
Automatic Speech Recognition | English Hardcase (test) | F1 Score 90.42 | 7
Showing 10 of 13 rows
