Theory of Mind reasoning on ToMI (All)

Accuracy: 87.8

gpt-4

Updated 5mo ago

Evaluation Results

Method	Links	Accuracy
gpt-4 2023.11		87.8
gpt-4 2023.11		74.4
gpt-3.5-turbo 2023.11		72.8
gpt-3.5-turbo 2023.11		68.6
gpt-4 2023.11		66.5
gpt-3.5-turbo 2023.11		64.1
Llama2-13b-chat 2023.11		61.1
Llama2-13b-chat 2023.11		51
Llama2-7b-chat 2023.11		48.1
Llama2-13b-chat 2023.11		45
Llama2-7b-chat 2023.11		44.5
Llama2-7b-chat 2023.11		43.7