Temporal Reasoning on TimeQuestions (test)
[Chart: leaderboard scores (metrics: EM, F1). Highlighted point: EXAQT, 56.8 EM, May 23, 2023.]
Updated 1mo ago
Evaluation Results
Method   Settings                    Date      EM     F1
EXAQT    Evaluation Protocol=Fi...   2023.05   56.8   -
QAaP     Backbone=gpt-3.5-turbo...   2023.05   36.8   46.7
CoT      Backbone=gpt-3.5-turbo...   2023.05   28.2   41.2
ReAct    Backbone=gpt-3.5-turbo...   2023.05   12.6   17.5