
Think Twice: Perspective-Taking Improves Large Language Models' Theory-of-Mind Capabilities

About

Human interactions are deeply rooted in the interplay of thoughts, beliefs, and desires made possible by Theory of Mind (ToM): our cognitive ability to understand the mental states of ourselves and others. Although ToM may come naturally to us, emulating it presents a challenge to even the most advanced Large Language Models (LLMs). Recent improvements to LLMs' reasoning capabilities from simple yet effective prompting techniques such as Chain-of-Thought have seen limited applicability to ToM. In this paper, we turn to the prominent cognitive science theory "Simulation Theory" to bridge this gap. We introduce SimToM, a novel two-stage prompting framework inspired by Simulation Theory's notion of perspective-taking. To implement this idea on current ToM benchmarks, SimToM first filters context based on what the character in question knows before answering a question about their mental state. Our approach, which requires no additional training and minimal prompt-tuning, shows substantial improvement over existing methods, and our analysis reveals the importance of perspective-taking to Theory-of-Mind capabilities. Our findings suggest perspective-taking as a promising direction for future research into improving LLMs' ToM capabilities.
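The two-stage idea described above (first filter the context to what the character knows, then answer the question about their mental state) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompt wording, the function names, and the `LLM` callable type are all assumptions introduced here, and any real model client would need to be wrapped to match that signature.

```python
from typing import Callable

# Hypothetical type: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]


def perspective_taking_prompt(story: str, character: str) -> str:
    """Stage 1 prompt (assumed wording): keep only events the character knows about."""
    return (
        f"The following is a sequence of events:\n{story}\n\n"
        f"Which of these events does {character} know about? "
        "List only those events."
    )


def question_answering_prompt(filtered_story: str, character: str, question: str) -> str:
    """Stage 2 prompt (assumed wording): answer using only the filtered context."""
    return (
        f"{filtered_story}\n\n"
        f"You are {character}. Based on the above, answer the question:\n"
        f"{question}"
    )


def two_stage_answer(llm: LLM, story: str, character: str, question: str) -> str:
    """Run both stages: filter the story by perspective, then answer from it."""
    filtered = llm(perspective_taking_prompt(story, character))
    return llm(question_answering_prompt(filtered, character, question))
```

The point of the design is that the second call never sees the full story, only the first call's output, so events the character did not witness cannot leak into the answer. Both stages are plain prompts, which matches the abstract's claim that no additional training is required.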

Alex Wilf, Sihyun Shawn Lee, Paul Pu Liang, Louis-Philippe Morency • 2023

Related benchmarks

Task                     | Dataset                   | Result                     | Rank
-------------------------|---------------------------|----------------------------|-----
Theory of Mind           | HiToM                     | Accuracy: 71               | 64
Theory of Mind           | ToMi                      | Accuracy: 79.9             | 55
Theory of Mind           | BigToM                    | Accuracy: 77.5             | 48
Theory of Mind reasoning | MMToM-QA                  | Overall Accuracy: 51       | 44
Theory of Mind reasoning | MuMa-ToM                  | Accuracy: 47.63            | 40
Theory of Mind reasoning | BigTOM (All)              | Accuracy: 95.5             | 24
Mental State Inference   | MMToM-QA human 1.0 (test) | Sub-score 1.1: 100         | 20
Theory of Mind reasoning | BigTOM False Belief       | Accuracy: 93.25            | 18
Theory of Mind reasoning | ToMI False Belief         | Accuracy: 95.5             | 18
Theory of Mind reasoning | MMToM-QA Text-only        | Belief Inference 1.1: 0.96 | 17
Showing 10 of 11 rows
