CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling

About

Using large language models (LLMs) to assist psychological counseling is a significant but challenging task. Attempts have been made to improve empathetic conversations or to use LLMs as effective assistants in treatment. However, existing datasets lack consulting knowledge, which leaves LLMs without professional consulting competence. Moreover, how to automatically evaluate multi-turn dialogues within the counseling process remains understudied. To bridge this gap, we propose CPsyCoun, a report-based multi-turn dialogue reconstruction and evaluation framework for Chinese psychological counseling. To fully exploit psychological counseling reports, we devise a two-phase approach to construct high-quality dialogues and develop a comprehensive benchmark for the automatic evaluation of multi-turn psychological consultations. Competitive experimental results demonstrate the effectiveness of the proposed framework in psychological counseling. We open-source the datasets and model for future research at https://github.com/CAS-SIAT-XinHai/CPsyCoun

Chenhao Zhang, Renhao Li, Minghuan Tan, Min Yang, Jingwei Zhu, Di Yang, Jiahao Zhao, Guancheng Ye, Chengming Li, Xiping Hu • 2024

Related benchmarks

Task | Dataset | Result | Rank
Multi-turn Psychological Counseling Dialogue Generation | CPsyCoun | ROUGE-1: 23.71 | 16
Emotional Support Conversation | SAGE (test) | Sentience: 20.35 | 14
Psychological Counseling | CPsyCounR Single-Session | T.Alli: 0.374 | 12
Psychological Counseling | CPsyCounR Multi-Session | Coherence (Coh): 0.64 | 12
Therapeutic Competence Evaluation | PSYCHEPASS | Elo Score: 44 | 12
Multi-session Psychological Counseling | PsychEval | Counselor-level Shared Score: 4.21 | 10
Psychological Counseling Dialogue Evaluation | CPsyCoun (test) | Comprehensiveness: 1.39 | 5
Expert Evaluation | Psychological Counseling Datasets | Coherence: 5.57 | 4
Human agreement evaluation | PsyCrisis (sampled) | Pearson R (Overall): 0.1524 | 3
