
Temporal spatial reasoning on VSTI-Bench (test)

68.9 Average Score

Cambrian-P

[Chart: score over time, Dec 29, 2025 to May 21, 2026]
Updated 12d ago

Evaluation Results

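| Date | Average Score | | | | | |
|---|---|---|---|---|---|---|
| 2026.05 | 68.9 | 42.5 | 46.6 | 87.7 | 94.3 | 73.2 |
| 2026.05 | 67.4 | 38.4 | 45.8 | 84.2 | 93.6 | 75.2 |
| 2026.05 | 62.4 | 39.4 | 40.6 | 67.7 | 92.2 | 72 |
| 2026.05 | 58.8 | 39.4 | 39.6 | 60.6 | 86.5 | 68.6 |
| 2025.12 | 46.8 | 27.8 | 5 | 48.2 | 74.2 | 52.3 |
| 2025.12 | 46.7 | 28.6 | 7.1 | 47.9 | 73.9 | 53 |
| 2025.12 | 44 | 32.3 | 10.5 | 48.1 | 78.3 | 50.9 |
| 2026.05 | 44 | 32.3 | 10.5 | 48.1 | 78.3 | 50.9 |
| 2025.12 | 43.5 | 32.9 | 13.5 | 48 | 68 | 55 |
| 2025.12 | 41.7 | 29.9 | 19.3 | 47.5 | 62.1 | 49.8 |
| 2025.12 | 40 | 28.2 | 1.8 | 49.8 | 64.7 | 55.6 |
| 2025.12 | 38.2 | 29.5 | 23.4 | 37.3 | 58.1 | 42.5 |
| 2025.12 | 38.2 | 28.2 | 15.7 | 28.8 | 65.4 | 53 |
| 2026.05 | 38.2 | 29.5 | 23.4 | 37.3 | 58.1 | 42.5 |
| 2025.12 | 38.1 | 17.7 | 27.8 | 43 | 54.9 | 47.2 |
| 2025.12 | 37.3 | 30.1 | 27.3 | 42.2 | 50.4 | 36.7 |
| 2025.12 | 36.9 | 16.5 | 32.4 | 46.1 | 50.5 | 39 |
| 2025.12 | 32.3 | 13.5 | 5.1 | 43.7 | 57.9 | 41.2 |
| 2025.12 | 32.1 | 28.5 | 20.9 | 24.4 | 52.6 | 33.9 |
| | 32.1 | 28.5 | 20.9 | 24.4 | 52.6 | 33.9 |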
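
Reading the top entry as an average of 68.9 followed by five per-task scores (42.5, 46.6, 87.7, 94.3, 73.2), the average appears to be the unweighted mean of those five values. The page does not state this, so the column reading and the averaging rule are both assumptions. A minimal Python sketch of the check:

```python
# Top entry of the results list. Splitting it into one average plus
# five per-task scores is an assumption; the page does not label the columns.
reported_average = 68.9
task_scores = [42.5, 46.6, 87.7, 94.3, 73.2]

# Unweighted mean of the five per-task scores.
mean = sum(task_scores) / len(task_scores)

# Leaderboard values are shown to one decimal place, so compare at that precision.
matches = abs(mean - reported_average) < 0.05
print(f"mean={mean:.2f} reported={reported_average} matches={matches}")
```

The same check can be run on any other row by swapping in its values. A row that fails it may use a different averaging rule or a different source for its reported average.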