
Speech Emotion Recognition with ASR Integration

About

Speech Emotion Recognition (SER) plays a pivotal role in understanding human communication, enabling emotionally intelligent systems, and serving as a fundamental component in the development of Artificial General Intelligence (AGI). However, deploying SER in real-world, spontaneous, and low-resource scenarios remains a significant challenge due to the complexity of emotional expression and the limitations of current speech and language technologies. This thesis investigates the integration of Automatic Speech Recognition (ASR) into SER, with the goal of enhancing the robustness, scalability, and practical applicability of emotion recognition from spoken language.

Yuanchao Li • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Emotion Recognition | IEMOCAP | -- | 71 |
| Multimodal Sentiment Analysis | CMU-MOSI | MAE 0.8557 | 59 |
| Sentiment Analysis | CMU-MOSI | Accuracy (2-class) 85.1 | 21 |
| Humor Detection | UR-FUNNY | ACC2 75.09 | 20 |
| Emotion Recognition | CMU-MOSEI | F1 Score 84 | 19 |
| Sarcasm Detection | MUSTARD | Accuracy 76.62 | 13 |
| Dementia detection | Dementia detection dataset (train) | Unweighted Average Accuracy 80.87 | 9 |
| Emotion Recognition | Emotion recognition dataset (train) | UA (%) 75.1 | 9 |
| ASR Error Correction | ASR Error Correction Evaluation Set (test) | WER 16.07 | 6 |
| Speech Emotion Recognition | MSP-Podcast | WER 12.85 | 3 |
