Temporal Reasoning on TimeQuestions (test)
[Chart: EM and F1 scores. Top result: 56.8 EM (EXAQT), May 23, 2023.]
Updated 4d ago
Evaluation Results
Method  Settings                   Links    EM    F1
EXAQT   Evaluation Protocol=Fi...  2023.05  56.8  -
QAaP    Backbone=gpt-3.5-turbo...  2023.05  36.8  46.7
CoT     Backbone=gpt-3.5-turbo...  2023.05  28.2  41.2
ReAct   Backbone=gpt-3.5-turbo...  2023.05  12.6  17.5