Temporal Question Answering on ReasonQA Single-hop
[Chart: Set Accuracy and Answer F1 over time. Best Set Accuracy: 95.1, T5-large PIT-SFT, Nov 16, 2023.]
Updated 4d ago
Evaluation Results
Method            Details                    Date     Set Accuracy  Answer F1
T5-large PIT-SFT  Backbone=T5-large, Tra...  2023.11  95.1          95.6
T5-base PIT-SFT   Backbone=T5-base, Trai...  2023.11  91.6          93.8
T5-large SFT      Backbone=T5-large, Tra...  2023.11  86            88.1
T5-base SFT       Backbone=T5-base, Trai...  2023.11  80.4          83.3
GPT-4             Model version=gpt4-061...  2023.11  67.1          80.2
FLAN-T5-XL        Parameters=3B, Few-sho...  2023.11  61.5          64.1
GPT-3.5           Model version=gpt3.5-t...  2023.11  28            45.3