
Self-Speculative Decoding for LLM-based ASR with CTC Encoder Drafts

About

We propose self-speculative decoding for speech-aware LLMs by using the CTC encoder as a draft model to accelerate auto-regressive (AR) inference and improve ASR accuracy. Our three-step procedure works as follows: (1) if the frame entropies of the CTC output distributions are below a threshold, the greedy CTC hypothesis is accepted as final; (2) otherwise, the CTC hypothesis is verified in a single LLM forward pass using a relaxed acceptance criterion based on token likelihoods; (3) if verification fails, AR decoding resumes from the accepted CTC prefix. Experiments on nine corpora and five languages show that this approach can simultaneously accelerate decoding and reduce WER. On the HuggingFace Open ASR benchmark with a 1B parameter LLM and 440M parameter CTC encoder, we achieve a record 5.58% WER and improve the inverse real time factor by a factor of 4.4 with only a 12% relative WER increase over AR search. Code and model weights are publicly available under a permissive license.
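The three-step procedure above can be sketched as the control flow below. This is a minimal illustration, not the authors' implementation. The abstract does not specify how frame entropies are aggregated or how the relaxed acceptance criterion is defined, so this sketch assumes that every frame must fall below the entropy threshold and that a draft token is accepted when its LLM likelihood is at least `accept_threshold`. The names `llm_token_probs` and `ar_continue`, and both threshold values, are hypothetical placeholders.

```python
import math
from typing import Callable, List, Sequence, Tuple


def frame_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one CTC output distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def ctc_greedy(frame_probs: Sequence[Sequence[float]], blank: int = 0) -> List[int]:
    """Greedy CTC decoding: argmax per frame, collapse repeats, drop blanks."""
    tokens: List[int] = []
    prev = None
    for probs in frame_probs:
        k = max(range(len(probs)), key=probs.__getitem__)
        if k != prev and k != blank:
            tokens.append(k)
        prev = k
    return tokens


def self_speculative_decode(
    frame_probs: Sequence[Sequence[float]],
    llm_token_probs: Callable[[List[int]], List[float]],  # hypothetical: one LLM pass
    ar_continue: Callable[[List[int]], List[int]],         # hypothetical: AR decoding
    entropy_threshold: float = 0.5,   # assumed value, for illustration only
    accept_threshold: float = 0.1,    # assumed value, for illustration only
    blank: int = 0,
) -> Tuple[List[int], str]:
    draft = ctc_greedy(frame_probs, blank)

    # Step 1: accept the greedy CTC hypothesis if all frames are low-entropy
    # (assumption: "below a threshold" is read as "every frame below it").
    if all(frame_entropy(p) < entropy_threshold for p in frame_probs):
        return draft, "ctc"

    # Step 2: score the whole draft in a single LLM forward pass and accept
    # tokens while their likelihood stays at or above accept_threshold
    # (assumption: this stands in for the paper's relaxed criterion).
    likelihoods = llm_token_probs(draft)
    n_accepted = 0
    for p in likelihoods:
        if p < accept_threshold:
            break
        n_accepted += 1
    if n_accepted == len(draft):
        return draft, "verified"

    # Step 3: verification failed, so resume AR decoding from the accepted prefix.
    return ar_continue(draft[:n_accepted]), "ar"
```

The second element of the returned tuple records which of the three paths produced the hypothesis, which is convenient when measuring how often the cheaper paths are taken.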

George Saon, Samuel Thomas, Takashi Fukuda, Tohru Nagano, Avihu Dekel, Luis Lastras • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Speech Recognition | Multilingual LibriSpeech (MLS) (test) | WER 0.0322 | 21 |
| Automatic Speech Recognition | CommonVoice 17.0 (test) | Word Error Rate (WER) 2.8 | 18 |
| Automatic Speech Recognition | AMI IHM English Open ASR (test) | WER 8.31 | 3 |
| Automatic Speech Recognition | Earnings22 English Open ASR (test) | WER 8.96 | 3 |
| Automatic Speech Recognition | GigaSpeech English Open ASR (test) | WER 9.95 | 3 |
| Automatic Speech Recognition | LS Clean English Open ASR (test) | WER 1.37 | 3 |
| Automatic Speech Recognition | LS Other English Open ASR (test) | WER 2.88 | 3 |
| Automatic Speech Recognition | SPGISpeech English Open ASR (test) | Word Error Rate (WER) 3.8 | 3 |
| Automatic Speech Recognition | Tedlium English Open ASR (test) | WER 3.35 | 3 |
| Automatic Speech Recognition | VoxPopuli English Open ASR (test) | WER 5.99 | 3 |
