Multi-Doc Question Answering on LongBench

51.6 HotpotQA

GPT-3.5-Turbo-16k

[Chart: score over time; y-axis ticks 24.248, 31.349, 38.45, 45.551; x-axis Jan 13, 2024]
Updated 1mo ago

Evaluation Results

Date      HotpotQA
2024.01   51.6      37.7   26.9   28.7
2024.01   49.2      37.6   23.1   20.4
2024.01   46.4      38.3   25.2   19.3
2024.01   44.9      34.8   19.5   17.4
2024.01   43.7      34.8   22     22.6
2024.01   38.1      36     10.7   20.9
2024.01   37.1      34.4   17.9   18.6
2024.01   37        30.3   17.1   15.3
2024.01   36.1      32.4   14.5   26.8
2024.01   31.5      20.6   9.7    19.5
2024.01   25.4      32.8   9.4    5.2
2024.01   25.3      20.8   9.8    19.3