Temporal Question Answering on ArchivalQA
[Chart: Accuracy on ArchivalQA. Top entry: AdapTime, 32.2 (Apr 27, 2026). Metrics available: Accuracy, F1 Score.]
Updated 1mo ago
Evaluation Results
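| Method          | Setting          | Links   | Accuracy | F1 Score |
|-----------------|------------------|---------|----------|----------|
| AdapTime        | mode=open-domain | 2026.04 | 32.2     | 30.5     |
| Step-back       | mode=open-domain | 2026.04 | 29.5     | 27.8     |
| Self-refinement | mode=open-domain | 2026.04 | 28.2     | 24.3     |
| Deepseek-V3     | mode=open-domain | 2026.04 | 19.7     | 18.6     |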