
Commonsense Question Answering on CosmosQA

94 Accuracy

HUMAN

[Chart: accuracy over time, Feb 5, 2020 to May 8, 2026]
Updated 23d ago

Evaluation Results

Date      Accuracy
2021.03   94
2021.03   91.8
2022.09   91.12
2026.05   91
2021.03   90.3
2026.05   90
2026.05   90
2022.09   88.61
2026.05   88
2026.05   88
2021.03   85.4
2026.05   85
2021.03   83.5
2022.09   83.25
2026.03   82.4
2022.10   82.3
2026.03   82.1
2026.05   82
2026.05   82
2026.05   82
2020.02   81.83
2022.09   81.7
2026.03   81.5
2020.02   81.19
2020.02   80.93
2020.02   80.76
2022.10   80.7
2022.09   80.63
2022.10   80.6
2020.02   80.59
2022.10   80.5
2026.03   80.4
2022.09   80.29
2026.05   80
2026.03   79.5
2026.03   78.5
2026.03   78.1
2022.09   77.55
2026.03   77.1
2022.09   76.94
2022.09   76.88
2022.09   76.47
2026.03   76
2026.05   76
2026.03   75.9
2022.09   75.8
2026.05   75
2026.03   74.7
2026.03   74.2
2026.03   74
2026.03   73.9
2026.03   72.4
2026.03   71.7
2026.03   71.1
2022.09   70.64
2026.03   70.6
2022.09   69.18
2020.02   68.7
2022.09   68.34
2022.09   64.82
2022.09   64.72
2022.09   64.32
2022.09   64.22
2026.05   51
2026.05   51
2023.05   46.16
2022.09   25.46
2022.09   18.46