Temporal Spatial Reasoning on TempCompass
Best Average Accuracy: 76.8 (G2F-RAG), Apr 6, 2026. Updated 12d ago.
Evaluation Results
| Method | Details | Date | Average Accuracy |
|---|---|---|---|
| G2F-RAG | Size=14B, Base Model=I... | 2026.04 | 76.8 |
| G2F-RAG | Size=8B, Base Model=In... | 2026.04 | 75.5 |
| G2F-RAG | Size=7B, Base Model=Qw... | 2026.04 | 72 |
| InternVL3.5 | Size=14B | 2026.04 | 71.2 |
| InternVL3.5 | Size=8B | 2026.04 | 71 |
| InternVL3 | Size=8B | 2026.04 | 70.4 |
| GPT-4o (2024-05-13) | Size=- | 2026.04 | 69.5 |
| Qwen2.5-VL | Size=7B | 2026.04 | 69.2 |
| G2F-RAG | Size=7B, Base Model=LL... | 2026.04 | 69 |
| G2F-RAG | Size=4B, Base Model=In... | 2026.04 | 68.7 |
| InternVL2.5 | Size=8B | 2026.04 | 68.7 |
| Gemini1.5 Pro | Size=- | 2026.04 | 68.4 |
| LLaVA-Video | Size=7B | 2026.04 | 65.5 |
| InternVL3.5 | Size=4B | 2026.04 | 65.4 |
| Qwen2.5-VL | Size=3B | 2026.04 | 64.4 |
| MiniCPM-V 2.6 | Size=8B | 2026.04 | 59.6 |
| VILA-1.5 | Size=8B | 2026.04 | 58.8 |