Belief Prediction on EgoToM (test)
[Chart: True Accuracy. Highlighted point: Qwen2.5-VL-7B, 33.8, Mar 25, 2026. Selectable metrics: True Accuracy, Info Accuracy, True AND Info Accuracy.]
Updated 2mo ago
Evaluation Results
| Method | Intervention Setting | Date | True Accuracy | Info Accuracy | True AND Info Accuracy |
|---|---|---|---|---|---|
| Qwen2.5-VL-7B | +αΔ | 2026.03 | 33.8 | 90.6 | 27.9 |
| LLaVA-Next-Video-7B | +αΔ | 2026.03 | 32.9 | 99.8 | 30.8 |
| Qwen2.5-VL-7B | B... | 2026.03 | 29 | 92.5 | 23.7 |
| Gemini-2.5-Flash | B... | 2026.03 | 28.1 | 99.7 | 28.1 |
| LLaVA-Next-Video-7B | B... | 2026.03 | 19.5 | 99.7 | 19.2 |