Event-Event Temporal Reasoning (L3) on TEMPREASON 1.0 (test)
[Chart: top result TempT5, EM 8,110 (Jun 15, 2023). Available metrics: EM, F1, Delta F1.]
Updated 4d ago
Evaluation Results
| Method | Setting | Date | EM | F1 | Delta F1 |
|---|---|---|---|---|---|
| TempT5 | ReasonQA | 2023.06 | 8,110 | 86.1 | 3.1 |
| T5-SFT | ReasonQA | 2023.06 | 7,820 | 83 | - |
| ChatGPT | ReasonQA | 2023.06 | 4,950 | 52.3 | - |
| FLAN-T5-L | ReasonQA | 2023.06 | 3,630 | 47.5 | - |
| TempT5 | OBQA | 2023.06 | 2,110 | 32.4 | 1.2 |
| T5-SFT | OBQA | 2023.06 | 1,970 | 31.2 | - |
| ChatGPT | OBQA | 2023.06 | 1,700 | 25.3 | - |
| TempT5 | CBQA | 2023.06 | 1,230 | 25.4 | 0.1 |
| T5-SFT | CBQA | 2023.06 | 1,210 | 25.3 | - |
| ChatGPT | CBQA | 2023.06 | 1,200 | 21.8 | - |
| FLAN-T5-L | OBQA | 2023.06 | 810 | 19.2 | - |
| FLAN-T5-L | CBQA | 2023.06 | 40 | 10.5 | - |