Theory of Mind on FANToM
[Chart: Accuracy by date. Leading entry: DITTO, 95 (May 19, 2026).]
Updated 13d ago
Evaluation Results
Method                 Details                      Date      Accuracy
DITTO                  Backbone=Qwen3-VL-8B-I...    2026.05   95
GRPO                   Backbone=Qwen3-VL-8B-I...    2026.05   94
GPT-5.4                -                            2026.05   90
HumanLM-8B             -                            2026.05   78
Qwen3-VL-8B-Instruct   Role=Base                    2026.05   78
OSCToM-8B              Params=8B, Inference P...    2026.05   76
GPT-5-nano             -                            2026.05   72
Llama-3.1-8B-Base      Params=8B, Inference P...    2026.05   66
Qwen2.5-14B            Params=14B, Inference...     2026.05   54.5
Phi-3-Medium-14B       Params=14B, Inference...     2026.05   51
Qwen2.5-32B            Params=32B, Inference...     2026.05   46.5
Gemma-2-27B            Params=27B, Inference...     2026.05   38.5
Mistral-NeMo-12B       Params=12B, Inference...     2026.05   37.5
ExploreToM             Params=8B, Inference P...    2026.05   0.2