The Emotion is Not One-hot Encoding: Learning with Grayscale Label for Emotion Recognition in Conversation
About
Emotion recognition in conversation (ERC) predicts the emotion of the current utterance from the preceding context, and its output can be used in many natural language processing tasks. Although multiple emotions can coexist in a given sentence, most previous approaches treat ERC as single-label classification and predict only the annotated label. However, annotating a sentence with confidence scores or multiple labels is expensive and difficult. In this paper, we automatically construct a grayscale label that accounts for the correlation between emotions and use it for learning. That is, instead of using the given label as a one-hot encoding, we construct a grayscale label by measuring scores for the different emotions. We introduce several methods for constructing grayscale labels and confirm that each method improves emotion recognition performance. Our method is simple, effective, and universally applicable to previous systems. Experiments show that it significantly improves the performance of the baselines.
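The contrast between a one-hot target and a grayscale label can be illustrated with a minimal sketch. It is not the paper's construction: the abstract does not specify how the scores are measured, so the blending rule, the `gold_weight` parameter, the emotion list, and the score values below are all hypothetical choices made for illustration only.

```python
import math

def softmax(scores):
    """Turn raw scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def grayscale_label(gold_index, emotion_scores, gold_weight=0.7):
    """Blend the one-hot gold label with a score-based distribution.

    Assumption: a simple convex mix; the paper describes several
    construction methods that are not reproduced here.
    """
    dist = softmax(emotion_scores)
    label = [(1.0 - gold_weight) * p for p in dist]
    label[gold_index] += gold_weight
    return label

def soft_cross_entropy(logits, target):
    """Cross-entropy between model logits and a soft target distribution."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

# Hypothetical emotion set and made-up scores for one utterance.
EMOTIONS = ["anger", "joy", "sadness", "neutral"]
scores = [2.0, 0.1, 1.5, 0.3]          # e.g. from some auxiliary scorer
label = grayscale_label(0, scores)     # gold label: "anger"
loss = soft_cross_entropy([1.2, -0.3, 0.8, 0.0], label)

for name, weight in zip(EMOTIONS, label):
    print(f"{name}: {weight:.3f}")
print(f"loss: {loss:.4f}")
```

Setting `gold_weight=1.0` recovers the ordinary one-hot target, so under these assumptions the grayscale label acts as a drop-in replacement for the training target of an existing classifier.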
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Emotion Recognition in Conversation | MELD | Weighted Avg F1 | 66.5 | 137 |
| Conversational Emotion Recognition | IEMOCAP | Weighted Average F1 Score | 68.57 | 129 |
| Dialogue Emotion Detection | EmoryNLP | Weighted Avg F1 | 40.23 | 80 |
| Dialogue Emotion Detection | DailyDialog | Micro F1 (- neutral) | 0.6167 | 27 |