Visual Social Reasoning on MoMentS
[Chart: series Direct, Delta CoT, Delta CCoT, Delta COCOT; highlighted value 70.68 (GPT-4o, Direct); date shown: Jul 27, 2025]
Updated 1mo ago
Evaluation Results
Method               Model Category   Date      Direct   Delta CoT   Delta CCoT   Delta COCOT
GPT-4o               Standar...       2025.07   70.68    2.56        0.11         1.58
OpenAI o1-full       Reasoni...       2025.07   67.6     3.41        1.1          1.55
Gemini-3.0-Flash     Reasoni...       2025.07   64.8     2           3.7          6.51
Claude-3.5-Sonnet    Standar...       2025.07   63.75    2.4         1.75         0.8
GPT-5.2              Reasoni...       2025.07   62.95    0.52        0.63         0.4
OpenAI o3-mini       Reasoni...       2025.07   56       3.06        0.5          2
Gemini-2.5-Pro       Standar...       2025.07   55       2           0.5          5
Qwen2-VL-7B          Open-So...       2025.07   49       11.91       8.13         7.8
LLaVA-OneVision-7B   Open-So...       2025.07   45.82    2.39        0.32         0.4