
Prompting and Evaluating Large Language Models for Proactive Dialogues: Clarification, Target-guided, and Non-collaboration

About

Conversational systems based on Large Language Models (LLMs), such as ChatGPT, show exceptional proficiency in context understanding and response generation. However, despite their impressive capabilities, they still possess limitations, such as providing randomly guessed answers to ambiguous queries or failing to refuse users' requests, both of which are considered aspects of a conversational agent's proactivity. This raises the question of whether LLM-based conversational systems are equipped to handle proactive dialogue problems. In this work, we conduct a comprehensive analysis of LLM-based conversational systems, specifically focusing on three aspects of proactive dialogue systems: clarification, target-guided, and non-collaborative dialogues. To trigger the proactivity of LLMs, we propose the Proactive Chain-of-Thought prompting scheme, which augments LLMs with the goal planning capability over descriptive reasoning chains. Empirical findings are discussed to promote future studies on LLM-based proactive dialogue systems.
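The core idea of Proactive Chain-of-Thought (ProCoT) prompting is to ask the model to first reason about the dialogue state, then plan a dialogue act, and only then generate the response. A minimal sketch of such a prompt builder for the clarification setting is shown below; the template wording, the act space, and the `build_procot_prompt` helper are illustrative assumptions, not the paper's exact prompts.

```python
# Illustrative ProCoT-style prompt builder for clarification dialogues.
# The model is instructed to (1) analyse whether the query is ambiguous,
# (2) choose a dialogue act, and (3) produce the final response.

PROCOT_TEMPLATE = """Given the task background and the conversation history, \
first analyse whether the user's question is ambiguous, then select an \
appropriate dialogue act from ["Directly Answer", "Ask a Clarification Question"], \
and finally generate the response.

Task background: {background}
Conversation history:
{history}

First give your reasoning, then the chosen act, then the response."""


def build_procot_prompt(background: str, history: list[str]) -> str:
    """Assemble a ProCoT-style prompt: descriptive reasoning chain,
    explicit act planning, then response generation."""
    return PROCOT_TEMPLATE.format(background=background,
                                  history="\n".join(history))


if __name__ == "__main__":
    prompt = build_procot_prompt(
        "Open-domain question answering over possibly ambiguous queries.",
        ["User: When did he land on the moon?"],
    )
    print(prompt)
```

The resulting string would be sent to the LLM as a single prompt; the hedged act space above covers only the clarification setting, and the target-guided and non-collaborative settings would swap in their own act lists.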

Yang Deng, Lizi Liao, Liang Chen, Hongru Wang, Wenqiang Lei, Tat-Seng Chua • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Question Answering | ARC Challenge | - | - | 749 |
| Question Answering | ARC Easy | Accuracy | 79.6 | 386 |
| Question Answering | HotpotQA | Mean Per-Step Regret | 0.191 | 15 |
| Question Answering | SQuAD Abstract | Mean Per-Step Regret | 0.175 | 15 |
| Multi-task Knowledge Understanding | MMLU | Mean Per-Step Regret | 0.142 | 15 |
| Multiple-choice Question Answering | SciQ MC | Mean Per-Step Regret | 0.148 | 15 |
| Question Answering | SciQ Abstract | Mean Per-Step Regret | 0.153 | 15 |
| Question Answering | ARC Easy | Mean Regret | 0.115 | 15 |
| Question Answering | BoolQA | Mean Per-Step Regret | 0.199 | 15 |
| Truthful Question Answering | TruthfulQA | Mean Per-Step Regret | 0.166 | 15 |

Showing 10 of 20 rows.
