Agentic Person Search (Temporal Reasoning) on Track 3 Temporal
[Chart: TWS by date (Apr 14, 2026), with Oracle marked at 100. Selectable metrics: TWS, Top-1 Accuracy, SR@5, Avg Time. Updated 4d ago.]
Evaluation Results
| Method      | Backbone     | Date    | TWS  | Top-1 Accuracy | SR@5 | Avg Time |
|-------------|--------------|---------|------|----------------|------|----------|
| Oracle      | Oracle       | 2026.04 | 100  | 100            | 100  | 1.88     |
| ARGOS Agent | GPT-5.2      | 2026.04 | 59   | 88.2           | 65.8 | 3.91     |
| ARGOS Agent | GPT-4o       | 2026.04 | 56.7 | 80.6           | 60.8 | 3.91     |
| ARGOS Agent | GPT-5-mini   | 2026.04 | 55.6 | 88             | 60.5 | 4.4      |
| ARGOS Agent | Cl. Sonnet 4 | 2026.04 | 54.8 | 83.6           | 59.5 | 4.25     |