Supervised Prototypical Contrastive Learning for Emotion Recognition in Conversation
About
Capturing emotions within a conversation plays an essential role in modern dialogue systems. However, the weak correlation between emotions and semantics brings many challenges to emotion recognition in conversation (ERC). Even for semantically similar utterances, the emotion may vary drastically depending on the context or the speaker. In this paper, we propose a Supervised Prototypical Contrastive Learning (SPCL) loss for the ERC task. Leveraging the Prototypical Network, SPCL aims to solve the imbalanced classification problem through contrastive learning and does not require a large batch size. Meanwhile, we design a difficulty measure function based on the distance between classes and introduce curriculum learning to alleviate the impact of extreme samples. We achieve state-of-the-art results on three widely used benchmarks. Further, we conduct analytical experiments to demonstrate the effectiveness of our proposed SPCL and curriculum learning strategy. We release the code at https://github.com/caskcsg/SPCL.
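To make the idea of contrasting a sample against class prototypes concrete, here is a minimal sketch of a generic prototype-based contrastive loss. It is not the SPCL loss itself, whose exact formulation the abstract does not give. The function names, the cosine similarity, the mean-embedding prototypes, and the `temperature` value are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def class_prototypes(embeddings, labels):
    """Average the embeddings of each class into one prototype vector (assumed design)."""
    sums = {}
    counts = defaultdict(int)
    for vec, lab in zip(embeddings, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


def cosine(a, b):
    """Cosine similarity between two vectors, with a small epsilon for stability."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def prototype_contrastive_loss(embedding, label, prototypes, temperature=0.1):
    """Negative log-softmax of the similarity to the true-class prototype.

    The sample is pulled toward its own class prototype and pushed away from
    the prototypes of all other classes. Hypothetical helper, not the paper's API.
    """
    logits = {lab: cosine(embedding, proto) / temperature
              for lab, proto in prototypes.items()}
    # Log-sum-exp with max subtraction to avoid overflow.
    top = max(logits.values())
    log_denominator = top + math.log(sum(math.exp(v - top) for v in logits.values()))
    return log_denominator - logits[label]


# Toy two-dimensional "utterance embeddings" with made-up emotion labels.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = ["joy", "joy", "anger", "anger"]
prototypes = class_prototypes(embeddings, labels)
loss_correct = prototype_contrastive_loss([1.0, 0.0], "joy", prototypes)
loss_wrong = prototype_contrastive_loss([1.0, 0.0], "anger", prototypes)
```

Because each sample is compared with one prototype per class rather than with every other sample in the batch, a loss of this shape does not depend on the batch containing many positives and negatives, which is one plausible reading of why a prototype-based approach can work with small batches.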
Related benchmarks
| Task | Dataset | Metric | Score | Rank |
|---|---|---|---|---|
| Emotion Recognition in Conversation | IEMOCAP (test) | Weighted Average F1 Score | 69.74 | 154 |
| Emotion Recognition in Conversation | MELD | Weighted Avg F1 | 66.13 | 137 |
| Conversational Emotion Recognition | IEMOCAP | Weighted Average F1 Score | 68.42 | 129 |
| Emotion Recognition in Conversation | MELD (test) | Weighted F1 | 67.25 | 118 |
| Emotion Detection | EmoryNLP (test) | Weighted-F1 | 0.4094 | 96 |
| Dialogue Emotion Detection | EmoryNLP | Weighted Avg F1 | 40.25 | 80 |
| Emotion Recognition in Conversation | MELD standard (test) | Weighted F1 | 66.35 | 19 |
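All rows above report a weighted F1 score. As a reminder of what that metric measures, the following sketch computes it from scratch: the per-class F1 is averaged with weights proportional to each class's share of the true labels. The function name and the toy labels are illustrative assumptions, not taken from any benchmark.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores, returned in [0, 1]."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        pairs = list(zip(y_true, y_pred))
        true_pos = sum(1 for t, p in pairs if t == label and p == label)
        false_pos = sum(1 for t, p in pairs if t != label and p == label)
        false_neg = count - true_pos
        denominator = 2 * true_pos + false_pos + false_neg
        f1 = 2 * true_pos / denominator if denominator else 0.0
        # Weight each class by its share of the true labels.
        score += (count / total) * f1
    return score


# Toy example with an imbalanced label distribution.
y_true = ["neutral", "neutral", "neutral", "anger"]
y_pred = ["neutral", "neutral", "anger", "anger"]
result = weighted_f1(y_true, y_pred)
```

The function returns a fraction; multiplying by 100 gives the percentage scale that most rows in the table use.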