Temporal Reasoning on TempQuestions (test)
[Chart: Exact Match (EM) and F1 Score over time, Dec 31, 2022 to May 23, 2023. Top result: QAaP, 60.3 Exact Match (EM).]
Updated 4d ago
Evaluation Results
| Method | Configuration | Date | Exact Match (EM) | F1 Score |
| --- | --- | --- | --- | --- |
| QAaP | Backbone=gpt-3.5-turbo... | 2023.05 | 60.3 | 68.1 |
| CoT | Backbone=gpt-3.5-turbo... | 2023.05 | 49.8 | 60.1 |
| TEQUILA | Evaluation Protocol=Fi... | 2023.05 | 43.8 | 44.6 |
| Rethinking with retrieval | Model=GPT-3 (text-davi... | 2022.12 | 39.05 | - |
| Self-consistency | Model=GPT-3 (text-davi... | 2022.12 | 37.28 | - |
| Chain-of-thought prompting | Model=GPT-3 (text-davi... | 2022.12 | 33.14 | - |
| Few-shot prompting | Model=GPT-3 (text-davi... | 2022.12 | 29.59 | - |
| Zero-shot prompting | Model=GPT-3 (text-davi... | 2022.12 | 28.4 | - |
| ReAct | Backbone=gpt-3.5-turbo... | 2023.05 | 28 | 33.7 |
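
The page does not say how the Exact Match (EM) and F1 Score values are computed. As a rough illustration only, the sketch below assumes the common SQuAD-style convention: answers are lowercased, stripped of punctuation and English articles, and then compared either as whole strings (EM) or as bags of tokens (F1). The function names (`normalize`, `exact_match`, `token_f1`, `score`) are hypothetical, and the listed papers may use different normalization or matching rules.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace (assumed SQuAD-style)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # If either side is empty, count a match only when both are empty.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def score(predictions: list[str], references: list[list[str]]) -> dict[str, float]:
    """Average per-question scores (best over all gold answers), as percentages."""
    em_total = 0.0
    f1_total = 0.0
    for pred, golds in zip(predictions, references):
        em_total += max(exact_match(pred, g) for g in golds)
        f1_total += max(token_f1(pred, g) for g in golds)
    n = len(predictions)
    return {"em": 100.0 * em_total / n, "f1": 100.0 * f1_total / n}
```

Under these assumptions, a prediction such as "in 1969" against the gold answer "1969" scores 0 on EM but receives partial credit on F1, which is one reason the F1 column sits above the EM column for every method that reports both.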