Prompting and Evaluating Large Language Models for Proactive Dialogues: Clarification, Target-guided, and Non-collaboration

About

Conversational systems based on Large Language Models (LLMs), such as ChatGPT, show exceptional proficiency in context understanding and response generation. However, despite their impressive capabilities, they still possess limitations, such as providing randomly-guessed answers to ambiguous queries or failing to refuse users' requests, both of which are considered aspects of a conversational agent's proactivity. This raises the question of whether LLM-based conversational systems are equipped to handle proactive dialogue problems. In this work, we conduct a comprehensive analysis of LLM-based conversational systems, specifically focusing on three aspects of proactive dialogue systems: clarification, target-guided, and non-collaborative dialogues. To trigger the proactivity of LLMs, we propose the Proactive Chain-of-Thought prompting scheme, which augments LLMs with the goal planning capability over descriptive reasoning chains. Empirical findings are discussed to promote future studies on LLM-based proactive dialogue systems.
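The abstract describes Proactive Chain-of-Thought prompting only at a high level: the model reasons descriptively about the dialogue, plans its next goal or action, and then responds. The sketch below is one possible reading of that idea for the clarification setting, not the authors' actual prompt. The action labels, the `Thought`/`Action`/`Response` output format, and the helper names `build_prompt` and `parse_turn` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence

# Hypothetical action set for a clarification dialogue (labels are illustrative,
# not taken from the paper).
ACTIONS = ("ask_clarifying_question", "answer_directly")


@dataclass
class ProactiveTurn:
    thought: str   # descriptive reasoning about the dialogue state
    action: str    # the planned action, one of ACTIONS
    response: str  # the reply that carries out the action


def build_prompt(task_background: str, history: Sequence[str],
                 actions: Sequence[str] = ACTIONS) -> str:
    """Assemble a prompt asking for reasoning, then an action, then a reply."""
    lines = [
        task_background,
        "Conversation so far:",
        *history,
        "First, reason about the current state of the dialogue.",
        "Then choose exactly one action from: " + ", ".join(actions) + ".",
        "Finally, write the response that carries out that action.",
        "Use this format:",
        "Thought: <reasoning>",
        "Action: <action>",
        "Response: <reply>",
    ]
    return "\n".join(lines)


def parse_turn(model_output: str,
               actions: Sequence[str] = ACTIONS) -> ProactiveTurn:
    """Extract the three labeled fields from single-line model output fields."""
    fields = {}
    for line in model_output.splitlines():
        key, sep, value = line.partition(":")
        name = key.strip().lower()
        if sep and name in ("thought", "action", "response"):
            # Keep the first occurrence of each field.
            fields.setdefault(name, value.strip())
    action = fields.get("action", "")
    if action not in actions:
        raise ValueError("unknown or missing action: %r" % action)
    return ProactiveTurn(fields.get("thought", ""), action,
                         fields.get("response", ""))
```

In use, the string returned by `build_prompt` would be sent to whichever LLM client is available, and the raw completion passed to `parse_turn`. Rejecting an action outside the allowed set is a design choice of this sketch: it keeps a malformed completion from being treated as a valid plan.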

Yang Deng, Lizi Liao, Liang Chen, Hongru Wang, Wenqiang Lei, Tat-Seng Chua • 2023

Related benchmarks

Task	Dataset	Result
Question Answering	ARC Challenge	--	906
Question Answering	ARC Easy	Accuracy 79.6	597
Dialogue Response Generation	Chronicle	B-4 29.2	38
Dialogue Response Generation	MSC	B-4 Score 32.5	38
Response Generation	Chronicle and MSC Average	CEA 44	30
Active Reasoning	AR-Bench-DC	Exact Accuracy 49	23
Charity Persuasion	P4G User Simulation	Success Rate (SR) 68	16
Question Answering	HotpotQA	Mean Per-Step Regret 0.191	15
Question Answering	SQuAD Abstract	Mean Per-Step Regret 0.175	15
Multi-task Knowledge Understanding	MMLU	Mean Per-Step Regret 0.142	15

Showing 10 of 39 rows
