TOM-SB

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Adversarial Theory of Mind | TOM-SB | Fooling Rate (Hard): 42.4 | 9 |
| Theory of Mind | TOM-SB | ToM Accuracy (Trajectory): 72 | 3 |
| Adversarial Defense | TOM-SB | Fooling % (Hard): 11.5 | 3 |