Overcoming Multi-step Complexity in Multimodal Theory-of-Mind Reasoning: A Scalable Bayesian Planner

About

Theory-of-Mind (ToM) enables humans to infer mental states, such as beliefs, desires, and intentions, forming the foundation of social cognition. However, existing computational ToM methods rely on structured workflows with ToM-specific priors or deep model fine-tuning, which struggle with scalability in multimodal environments and fail to generalize as task complexity increases. To address these limitations, we propose a scalable Bayesian ToM planner that decomposes ToM reasoning into stepwise Bayesian updates. Our framework introduces weak-to-strong control, allowing smaller language models (LMs) to specialize in ToM-specific likelihood estimation and transfer their reasoning behaviors to larger LMs (7B to 405B) for integration with social and world knowledge. This synergistic approach aligns large-model inference of human mental states with Bayesian principles. Extensive experiments show that our method achieves a 4.6% accuracy improvement over state-of-the-art techniques on multimodal ToM benchmarks, including challenging unseen scenarios, thereby establishing a new standard for modeling human mental states in complex environments.

Chunhui Zhang, Zhongyu Ouyang, Kwonjoon Lee, Nakul Agarwal, Sean Dae Houlihan, Soroush Vosoughi, Shao-Yuan Lo • 2025
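
The abstract describes decomposing ToM reasoning into stepwise Bayesian updates. As a rough illustration of that general idea only (not the authors' planner), the sketch below updates a distribution over candidate beliefs one observed step at a time. The hypothesis names and likelihood values are made up for the example; in the paper's setting the per-step likelihoods would presumably come from a language model rather than a hand-written table.

```python
from typing import Dict, List

Belief = Dict[str, float]  # maps a mental-state hypothesis to its probability


def bayesian_update(prior: Belief, likelihood: Belief) -> Belief:
    """One Bayes step: posterior(h) is proportional to prior(h) * likelihood(h)."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("all hypotheses have zero posterior mass")
    return {h: v / total for h, v in unnormalized.items()}


def stepwise_posterior(prior: Belief, step_likelihoods: List[Belief]) -> Belief:
    """Fold a sequence of per-step likelihoods into the prior, one step at a time."""
    belief = dict(prior)
    for likelihood in step_likelihoods:
        belief = bayesian_update(belief, likelihood)
    return belief


# Hypothetical example: two candidate beliefs an agent might hold.
prior = {"apple_in_fridge": 0.5, "apple_in_cabinet": 0.5}

# Hypothetical likelihoods of each observed action under each belief
# (assumed values, standing in for model-estimated likelihoods).
steps = [
    {"apple_in_fridge": 0.8, "apple_in_cabinet": 0.2},  # agent walks toward the fridge
    {"apple_in_fridge": 0.6, "apple_in_cabinet": 0.4},  # agent pauses near the kitchen
]

posterior = stepwise_posterior(prior, steps)
```

Normalizing after every step keeps the running belief a valid probability distribution, so the intermediate result of any step can be inspected on its own.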

Related benchmarks

Task | Dataset | Result | Rank
Theory of Mind reasoning | MMToM-QA Text-only | Belief Inference 1.1: 0.901 | 17
Theory of Mind reasoning | MMToM-QA Multimodal | Belief Inference 1.1: 92.1 | 14
Theory of Mind reasoning | MMToM-QA Video-only | -- | 13
Social interaction reasoning | MuMa-ToM | Belief Score: 94 | 11
Theory of Mind reasoning | apartment seen | Belief Inference Accuracy: 87 | 6
Theory of Mind reasoning | Andersen tales | Belief Inference Accuracy: 85.8 | 6
Theory of Mind reasoning | ancient Egyptian | Belief Inference Accuracy: 86 | 6
Theory of Mind reasoning | outer space | Belief Inference Accuracy: 87.2 | 6
Theory of Mind reasoning | wild west | Belief Inference Accuracy: 85.3 | 6
Theory of Mind reasoning | medieval castle | Belief Inference Accuracy: 85.6 | 6