Temporal Question Answering on ReasonQA Multi-hop

Set Accuracy: 85

T5-large PIT-SFT

Updated 5mo ago

Evaluation Results

Method	Date	Set Accuracy	Second metric (unlabeled)
T5-large PIT-SFT	2023.11	85	89.5
T5-base PIT-SFT	2023.11	78	82.4
T5-large SFT	2023.11	71	76.4
T5-base SFT	2023.11	59.1	65.1
GPT-4	2023.11	51.6	65.4
FLAN-T5-XL	2023.11	35.5	49.7
GPT-3.5	2023.11	31.2	51.8