Theory of Mind reasoning on ToMI (All)
[Chart: Accuracy over time. Best result: gpt-4, 87.8 accuracy (Nov 16, 2023).]
Updated 3mo ago
Evaluation Results
| Model | Prompting strategy | Date | Accuracy |
| --- | --- | --- | --- |
| gpt-4 | SIMTOM | 2023.11 | 87.8 |
| gpt-4 | 0-s... | 2023.11 | 74.4 |
| gpt-3.5-turbo | SIMTOM | 2023.11 | 72.8 |
| gpt-3.5-turbo | 0-Shot | 2023.11 | 68.6 |
| gpt-4 | 0-Shot | 2023.11 | 66.5 |
| gpt-3.5-turbo | 0-s... | 2023.11 | 64.1 |
| Llama2-13b-chat | SIMTOM | 2023.11 | 61.1 |
| Llama2-13b-chat | 0-Shot | 2023.11 | 51 |
| Llama2-7b-chat | SIMTOM | 2023.11 | 48.1 |
| Llama2-13b-chat | 0-s... | 2023.11 | 45 |
| Llama2-7b-chat | 0-Shot | 2023.11 | 44.5 |
| Llama2-7b-chat | 0-s... | 2023.11 | 43.7 |