Mind the Pause: Disfluency-Aware Objective Tuning for Multilingual Speech Correction with LLMs

About

Automatic Speech Recognition (ASR) transcripts often contain disfluencies, such as fillers, repetitions, and false starts, which reduce readability and hinder downstream applications like chatbots and voice assistants. If left unaddressed, such disfluencies can significantly degrade the reliability of downstream systems. Most existing approaches rely on classical models that focus on identifying disfluent tokens for removal. While this strategy is effective to some extent, it often disrupts grammatical structure and semantic coherence, leading to incomplete or unnatural sentences. Recent work has explored the use of large language models (LLMs); however, these efforts have primarily focused on disfluency detection or data augmentation rather than comprehensive correction. We propose a multilingual correction pipeline in which a sequence tagger first marks disfluent tokens, and these signals guide instruction fine-tuning of an LLM to rewrite transcripts into fluent text. To further improve reliability, we add a contrastive learning objective that penalizes the reproduction of disfluent tokens, encouraging the model to preserve grammar and meaning while removing disfluent artifacts. Our experiments across three Indian languages, namely Hindi, Bengali, and Marathi, show consistent improvements over strong baselines, including multilingual sequence-to-sequence models. These results highlight that detection-only strategies are insufficient. Combining token-level cues with instruction tuning and contrastive learning provides a practical and scalable solution for multilingual disfluency correction in speech-driven NLP systems. We make the code publicly available at https://github.com/deepak-kumar-98/Mind-the-Pause.
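The two ideas in the abstract, tagger signals feeding an instruction prompt and a penalty on reproducing disfluent tokens, can be illustrated with a small sketch. Everything here is an assumption made for illustration and is not taken from the paper or its repository: the `D`/`O` tag set, the `<dis>` markers, the prompt wording, the function names, and the `alpha` weight are all hypothetical. The penalty is an unlikelihood-style term used as a simple stand-in for the contrastive objective, whose exact form the abstract does not give.

```python
import math


def build_prompt(tokens, tags):
    """Wrap tagger-flagged tokens in <dis></dis> markers inside an instruction prompt.

    Hypothetical tag set: "D" marks a disfluent token, "O" a fluent one.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    marked = [
        f"<dis>{tok}</dis>" if tag == "D" else tok
        for tok, tag in zip(tokens, tags)
    ]
    return (
        "Rewrite the transcript as fluent text. "
        "Tokens inside <dis></dis> were flagged as disfluent.\n"
        "Transcript: " + " ".join(marked)
    )


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def penalized_loss(step_logits, target_ids, disfluent_ids, alpha=1.0):
    """Cross-entropy on the fluent target plus a penalty on disfluent tokens.

    step_logits:   one list of vocabulary logits per decoding step
    target_ids:    the fluent reference token id at each step
    disfluent_ids: vocabulary ids the tagger flagged as disfluent
    alpha:         hypothetical weight of the penalty term
    """
    total = 0.0
    for logits, target in zip(step_logits, target_ids):
        probs = softmax(logits)
        # Standard likelihood term for the fluent reference token.
        total += -math.log(probs[target])
        # Unlikelihood-style term: discourage mass on flagged tokens.
        for d in disfluent_ids:
            if d != target:
                total += -alpha * math.log(max(1.0 - probs[d], 1e-12))
    return total / len(target_ids)


if __name__ == "__main__":
    # Illustrative English tokens; the paper works on Hindi, Bengali, and Marathi.
    print(build_prompt(["I", "uh", "want", "tea"], ["O", "D", "O", "O"]))
    print(penalized_loss([[2.0, 0.5, 0.1]], [0], {1}, alpha=1.0))
```

With `alpha=0.0` the function reduces to plain cross-entropy, so the penalty can be read as an additive term whose size grows with the probability the model assigns to flagged tokens.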

Deepak Kumar, Baban Gain, Asif Ekbal • 2026

Related benchmarks

Task	Dataset	Result
Disfluency Correction	Hindi Real Data ASR (test)	BLEU 91.1	6
Disfluency Correction	Bengali Real Data ASR (test)	BLEU 75.9	6
Disfluency Correction	Marathi Real Data ASR (test)	BLEU 84.4	6
Disfluency Correction	Hindi Manually Edited (test)	BLEU 96.1	5
Disfluency Correction	Bengali Manually Edited (test)	BLEU 96.4	5
Disfluency Correction	Marathi Manually Edited (test)	BLEU 95.1	5
Disfluency Correction	Hindi Real Data ASR	BLEU 90.4	4
Disfluency Correction	Bengali Manually Edited	BLEU 94.8	4
Disfluency Correction	Marathi Real Data ASR	BLEU 83.6	4
Disfluency Correction	Hindi Manually Edited	Proposed (%) 18.1	2

Showing 10 of 15 rows
