Spatial-Temporal Reasoning on VSI
[Chart: Accuracy. Best result: ReMoT-4b-CoT, 58.8, Feb 28, 2026. Updated 1mo ago.]
Evaluation Results
| Method | Configuration | Date | Accuracy |
|---|---|---|---|
| ReMoT-4b-CoT | Model Size=4b, Thinkin... | 2026.02 | 58.8 |
| Qwen3-VL-30B-CoT | Model Size=30B, Thinki... | 2026.02 | 56.1 |
| Qwen3-VL-4B-CoT | Model Size=4B, Thinkin... | 2026.02 | 55.2 |
| GPT-5 | | 2026.02 | 55 |
| Gemini-2.5-Pro | | 2026.02 | 53.6 |
| InternVL2.5-8B | Model Size=8B | 2026.02 | 46.6 |
| GPT-4o | | 2026.02 | 42.5 |
| LLaVA-Next-7B | Model Size=7B | 2026.02 | 35.6 |
| Qwen2.5-VL-7B | Model Size=7B | 2026.02 | 33 |