Commonsense Question Answering on CSQA (test)

0.953 Accuracy

Human

[Chart: accuracy over time. Y-axis ticks: 0.55156, 0.65578, 0.76, 0.86422. X-axis ticks: Jun 6, 2019; Apr 7, 2020; Feb 8, 2021; Dec 11, 2021; Oct 14, 2022; Aug 16, 2023; Jun 18, 2024.]
Updated 3d ago

Evaluation Results

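| Method | Date | Accuracy | | | Links |
|---|---|---|---|---|---|
| | 2019.06 | 0.953 | - | - | |
| | 2024.06 | 0.9254 | 1.65 | - | |
| | | 0.9017 | 2.06 | - | |
| | 2023.05 | 0.898 | - | - | |
| | 2023.05 | 0.894 | - | - | |
| | 2023.05 | 0.889 | - | - | |
| | 2024.06 | 0.8886 | 1.94 | - | |
| | 2023.05 | 0.874 | - | - | |
| | 2024.04 | 0.862 | - | 100 | |
| | | 0.8615 | 2.01 | - | |
| | 2023.05 | 0.861 | - | - | |
| | 2024.04 | 0.854 | - | 100 | |
| | 2024.06 | 0.8476 | 2.43 | - | |
| | 2024.06 | 0.8378 | 2.5 | - | |
| | 2023.05 | 0.833 | - | - | |
| | 2024.06 | 0.8206 | 2.49 | - | |
| | 2023.05 | 0.809 | - | - | |
| | 2023.05 | 0.807 | - | - | |
| | 2023.05 | 0.803 | - | - | |
| | 2023.05 | 0.8 | - | - | |
| | 2023.05 | 0.796 | - | - | |
| | 2023.05 | 0.795 | - | - | |
| | 2023.05 | 0.791 | - | - | |
| | 2023.05 | 0.784 | - | - | |
| | 2023.05 | 0.782 | - | - | |
| | 2023.05 | 0.781 | - | - | |
| | 2023.05 | 0.773 | - | - | |
| | 2023.07 | 0.773 | - | - | |
| | 2023.07 | 0.77 | - | - | |
| | 2023.05 | 0.768 | - | - | |
| | 2023.07 | 0.768 | - | - | |
| | 2023.07 | 0.767 | - | - | |
| | 2023.05 | 0.765 | - | - | |
| | 2023.05 | 0.765 | - | - | |
| | 2023.07 | 0.765 | - | - | |
| | 2023.07 | 0.763 | - | - | |
| | 2023.05 | 0.761 | - | - | |
| | 2023.05 | 0.756 | - | - | |
| | 2023.05 | 0.754 | - | - | |
| | 2024.04 | 0.75 | - | 82.7 | |
| | 2024.04 | 0.746 | - | - | |
| | 2024.04 | 0.741 | - | - | |
| | 2021.04 | 0.7401 | - | - | |
| | 2021.04 | 0.7387 | - | - | |
| | 2022.04 | 0.736 | - | - | |
| | 2023.05 | 0.735 | - | - | |
| | 2023.07 | 0.735 | - | - | |
| | 2024.04 | 0.734 | - | - | |
| | 2021.04 | 0.7303 | - | - | |
| | 2024.04 | 0.729 | - | - | |
| | 2021.04 | 0.7276 | - | - | |
| | 2021.04 | 0.7268 | - | - | |
| | 2021.04 | 0.7268 | - | - | |
| | 2022.04 | 0.7267 | - | - | |
| | 2020.04 | 0.726 | - | - | |
| | 2023.05 | 0.726 | - | - | |
| | 2024.04 | 0.726 | - | - | |
| | 2023.05 | 0.725 | - | - | |
| | 2024.04 | 0.724 | - | 80.3 | |
| | 2020.04 | 0.723 | - | - | |
| | 2020.04 | 0.721 | - | - | |
| | 2020.04 | 0.721 | - | - | |
| | 2023.05 | 0.721 | - | - | |
| | 2022.04 | 0.7188 | - | - | |
| | 2020.04 | 0.718 | - | - | |
| | 2020.04 | 0.716 | - | - | |
| | 2021.04 | 0.7121 | - | - | |
| | 2021.04 | 0.712 | - | - | |
| | 2024.04 | 0.712 | - | - | |
| | 2021.04 | 0.7112 | - | - | |
| | 2021.04 | 0.7111 | - | - | |
| | 2024.04 | 0.708 | - | - | |
| | | 0.7043 | - | - | |
| | | 0.702 | - | - | |
| | 2021.04 | 0.7008 | - | - | |
| | 2021.04 | 0.6988 | - | - | |
| | 2024.04 | 0.697 | - | - | |
| | 2021.04 | 0.6933 | - | - | |
| | 2024.06 | 0.6855 | - | - | |
| | 2021.04 | 0.6841 | - | - | |
| | 2024.06 | 0.6802 | - | - | |
| | 2021.10 | 0.6798 | - | - | |
| | 2024.04 | 0.672 | - | - | |
| | 2019.06 | 0.647 | - | - | |
| | 2022.04 | 0.6345 | - | - | |
| | 2023.05 | 0.625 | - | - | |
| | 2022.04 | 0.6235 | - | - | |
| | 2019.08 | 0.622 | - | - | |
| | 2022.04 | 0.6134 | - | - | |
| | 2021.10 | 0.611 | - | - | |
| | 2022.04 | 0.609 | - | - | |
| | 2019.06 | 0.602 | - | - | |
| | 2024.06 | 0.6011 | - | - | |
| | 2024.06 | 0.5945 | - | - | |
| | 2022.04 | 0.5932 | - | - | |
| | 2022.04 | 0.59 | - | - | |
| | 2022.04 | 0.5891 | - | - | |
| | | 0.582 | - | - | |
| | 2019.08 | 0.582 | - | - | |
| | | 0.567 | - | - | |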
Showing 100 of 127 rows