Agentic Person Search (Spatial Reasoning) on Track 2 Spatial
[Figure: TWS chart; series shown: Oracle; date on axis: Apr 14, 2026. Selectable metrics: TWS, Top-1 Accuracy, Success Rate @ 5, Average Time.]
Updated 4d ago
Evaluation Results
Method       Backbone      Date     TWS    Top-1 Accuracy  Success Rate @ 5  Average Time
Oracle       Oracle        2026.04  100    100             100               2.05
ARGOS Agent  Cl. Sonnet 4  2026.04  38.3   76              38.4              6.71
ARGOS Agent  GPT-5.2       2026.04  33.8   73.1            33.6              7.04
ARGOS Agent  GPT-4o        2026.04  32.3   74.5            31.6              7.49
ARGOS Agent  GPT-5-mini    2026.04  31.9   74.9            32.2              7.48