
Video Spatial Reasoning on STI-Bench

Accuracy: 41.4

Gemini-2.5-pro

[Chart: accuracy over time. y-axis ticks: 28.504, 31.852, 35.2, 38.548; x-axis: Oct 10, 2025]
Updated 7d ago

Evaluation Results

Accuracy  Date     Method  Links
41.4      2025.10
40.7      2025.10
39.3      2025.10
39.2      2025.10
38.2      2025.10
37        2025.10
35.9      2025.10
35        2025.10
33.2      2025.10
32.1      2025.10
31.5      2025.10
30.5      2025.10
29.9      2025.10
29.3      2025.10
29