Theory of Mind Reasoning on COMMON-TOM 1.0 (test)

80 Total Accuracy

Human Performance

[Chart: Total Accuracy over time; y-axis ticks 49.216, 57.208, 65.2, 73.192; x-axis label Mar 4, 2024]
Updated 3mo ago

Evaluation Results

Method   Total Accuracy                            Date
-        80               85      80      75       2024.03
-        71               70.4    71.3    71.2     2024.03
-        64               64.8    63.9    63.2     2024.03
-        63.4             65.5    62.5    62.1     2024.03
-        60.6             63.3    60.5    58       2024.03
-        57               60.7    57.7    53       2024.03
-        50.4             50.3    50.5    50.4     -