Question Answering on HotpotQA (test) (Comprehensive Metrics)

0.822 F1 (HGN)

[Chart: results over time, Jan 27, 2022 to Apr 14, 2025]
Updated 2mo ago

Evaluation Results

Date     F1
2022.01  0.822   0.692   -       -       -       -       -       -        -
2022.01  0.816   0.687   -       -       -       -       -       -        -
2022.01  0.812   0.68    -       -       -       -       -       -        -
2022.01  0.812   0.674   -       -       -       -       -       -        -
2022.01  0.808   0.677   -       -       -       -       -       -        -
2022.01  0.797   0.665   -       -       -       -       -       -        -
2022.01  0.792   0.648   -       -       -       -       -       -        -
2022.01  0.734   0.591   -       -       -       -       -       -        -
2022.01  0.693   0.557   -       -       -       -       -       -        -
2025.02  0.665   -       0.6922  0.667   -       -       -       -        -
2025.02  0.6476  -       0.6721  0.6509  -       -       -       -        -
2025.02  0.646   -       0.672   0.6489  -       -       -       -        -
2025.02  0.6388  -       0.6634  0.6399  -       -       -       -        -
2025.02  0.6166  -       0.6414  0.6208  -       -       -       -        -
2025.02  0.6004  -       0.6232  0.6047  -       -       -       -        -
2025.04  0.2548  0.163   0.2605  0.3052  2.524   2.553   1.263   194.395  12.273
2025.04  0.2236  0.124   0.2261  0.3023  3.024   3.985   1.765   200.398  13.562
2025.04  0.1951  0.111   0.1931  0.3003  2.904   4.386   2.094   302.894  19.146