
Dolphin-CN-Dialect: Where Chinese Dialects Matter

About

We present Dolphin-CN-Dialect, a streaming-capable ASR model focused on Chinese and dialect-rich scenarios. Compared with its predecessor, Dolphin, Dolphin-CN-Dialect introduces substantial improvements in data processing, tokenization, training stability, and data sampling strategies. To address highly imbalanced dialect data, we propose a temperature-based sampling strategy that balances standard Mandarin against low-resource dialects, leading to significant gains in dialect recognition performance. In addition, we redesign the tokenizer to better align with linguistic characteristics, adopting character-level modeling for Chinese and subword modeling for English, while introducing extensible dialect tokens. Experimental results show that Dolphin-CN-Dialect improves dialect recognition accuracy and reduces character error rate (CER) relative to Dolphin. Furthermore, Dolphin-CN-Dialect is competitive with recent SOTA open-source ASR models while maintaining a significantly smaller model size. Dolphin-CN-Dialect supports both streaming and non-streaming inference, enabling a practical balance between latency and accuracy. It also provides flexible customization through hotword support and efficient deployment optimized for specialized hardware. These improvements make Dolphin-CN-Dialect a strong and practical solution for real-world multi-dialect ASR applications.

Yangyang Meng, Huihang Zhong, Guodong Lin, Guanbo Wang, Hu Du, Zhiming Shao, Yukai Huang, Ke Li, Wei-Qiang Zhang • 2026
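
The abstract names a temperature-based sampling strategy but this page does not give its formula. The sketch below uses one common formulation, in which each data source is sampled with probability proportional to its size raised to the power 1/T; whether the authors use exactly this form is an assumption. The function name, the variety names, and the hour counts are hypothetical and chosen only for illustration.

```python
def temperature_sampling_probs(counts, temperature):
    """Return sampling probabilities proportional to count ** (1 / temperature).

    temperature = 1 reproduces the natural data distribution; larger values
    flatten it, so low-resource sources are drawn more often.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    weights = {name: n ** (1.0 / temperature) for name, n in counts.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical hours of audio per variety (not taken from the paper).
counts = {"mandarin": 10000.0, "sichuan": 500.0, "minnan": 50.0}

natural = temperature_sampling_probs(counts, temperature=1.0)
flattened = temperature_sampling_probs(counts, temperature=5.0)

for name in counts:
    print(f"{name}: T=1 -> {natural[name]:.3f}, T=5 -> {flattened[name]:.3f}")
```

With a temperature above 1, the share of the smallest source rises and the share of the largest falls, which is the balancing effect the abstract describes. The right temperature is a tuning choice: too high and the low-resource data is repeated so often that the model may overfit to it.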

Related benchmarks

Task                          Dataset                           Result        Rank
Automatic Speech Recognition  KeSpeech                          CER 5.04      35
Speech Recognition            Haitan internal tw (test)         CER 6.68      10
Speech Recognition            Haitan internal sichuan (test)    CER 9.63      10
Speech Recognition            Haitan internal wu (test)         CER 9.49      10
Speech Recognition            Haitan internal minnan (test)     CER 20.74     10
Speech Recognition            Haitan internal liaoning (test)   CER 3.25      10
Speech Recognition            Haitan internal fujian (test)     CER 3.62      10
Speech Recognition            Haitan internal hunan (test)      CER 11.89     10
Speech Recognition            Haitan internal guangdong (test)  CER 6.03      10
Speech Recognition            Haitan internal (wenzhou) (test)  CER (%) 2.25  10
Showing 10 of 24 rows
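
The results above are reported as CER (character error rate). This page does not say how the scores were computed, so the following is only a minimal sketch of the standard definition: the character-level edit distance between reference and hypothesis, divided by the reference length and expressed as a percentage. Text normalization (punctuation, spacing, script variants) is omitted here and may differ from what the benchmarks use; the function names and the example sentence are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate in percent, relative to the reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# One character of a six-character reference is dropped by the hypothesis.
score = cer("今天天气很好", "今天天气好")
print(f"CER = {score:.2f}%")
```

Because the denominator is the reference length, CER can exceed 100% when the hypothesis contains many insertions.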
