Joint Turn and Dialogue level User Satisfaction Estimation on Multi-Domain Conversations

About

Dialogue-level quality estimation is vital for optimizing data-driven dialogue management. Current automated methods to estimate turn- and dialogue-level user satisfaction employ hand-crafted features and rely on complex annotation schemes, which reduce the generalizability of the trained models. We propose a novel user satisfaction estimation approach which minimizes an adaptive multi-task loss function in order to jointly predict turn-level Response Quality labels provided by experts and explicit dialogue-level ratings provided by end users. The proposed BiLSTM-based deep neural net model automatically weighs each turn's contribution towards the estimated dialogue-level rating, implicitly encodes temporal dependencies, and removes the need to hand-craft features. On dialogues sampled from 28 Alexa domains, two dialogue systems and three user groups, the joint dialogue-level satisfaction estimation model achieved absolute improvements in linear correlation of up to 27% (0.43 -> 0.70) over a baseline deep neural net model and 7% (0.63 -> 0.70) over a benchmark gradient boosting regression model.
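The two ideas in the abstract, weighing each turn's contribution to a dialogue-level estimate and combining the turn-level and dialogue-level losses adaptively, can be illustrated with a minimal sketch. The abstract does not give the exact form of the loss or of the turn weighting, so everything here is an assumption: the softmax pooling, the uncertainty-style weighting (each task loss scaled by `exp(-log_var)` with `log_var` added as a regularizer), and all function and parameter names are hypothetical and not taken from the paper.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dialogue_estimate(turn_scores, turn_logits):
    # Assumed pooling: dialogue-level estimate as a weighted average of
    # per-turn scores, with weights from a softmax over per-turn logits.
    weights = softmax(turn_logits)
    return sum(w * s for w, s in zip(weights, turn_scores))

def mse(pred, target):
    # Mean squared error between two equal-length lists.
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def adaptive_joint_loss(turn_loss, dialogue_loss, log_var_turn, log_var_dialogue):
    # Assumed adaptive weighting (not confirmed by the source): each task
    # loss is scaled by exp(-log_var), and log_var is added so the weights
    # cannot collapse to zero. In training, the log_var values would be
    # learned alongside the model parameters.
    return (math.exp(-log_var_turn) * turn_loss + log_var_turn
            + math.exp(-log_var_dialogue) * dialogue_loss + log_var_dialogue)

# Hypothetical example: three turns, equal turn weights.
turn_pred = [4.0, 2.0, 5.0]
turn_gold = [4.0, 3.0, 5.0]
dialogue_pred = dialogue_estimate(turn_pred, [0.0, 0.0, 0.0])
loss = adaptive_joint_loss(
    mse(turn_pred, turn_gold),
    (dialogue_pred - 4.0) ** 2,  # squared error against a made-up user rating
    log_var_turn=0.0,
    log_var_dialogue=0.0,
)
```

With both `log_var` values at zero the two losses are simply summed; raising one `log_var` down-weights that task, which is the sense in which such a loss is adaptive.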

Praveen Kumar Bodigutla, Aditya Tiwari, Josep Valls Vargas, Lazaros Polymenakos, Spyros Matsoukas • 2020

Related benchmarks

Task	Dataset	Result
User Satisfaction Estimation	MWOZ	Accuracy 47.6	14
User Satisfaction Estimation	SGD	Accuracy 57.4	14
User Satisfaction Estimation	JDDC	Accuracy 58.3	14
User Satisfaction Estimation	Bing Copilot 0.8% training size (test)	Precision 57.7	8
User Satisfaction Estimation	MWOZ 5% training size (test)	Precision 33.3	8
User Satisfaction Estimation	SGD 5% training size (test)	Precision 49.6	8
User Satisfaction Estimation	ReDial 5% training size (test)	Precision 40.6	8
