Question Answering on LongBench HotpotQA (F1)

63.58 F1 Score

CORE

Updated 1mo ago

Evaluation Results

Method	Links	F1
CORE 2025.08		63.58	126
Full Context 2025.08		62.22	9,151
LongLLMLingua 2025.08		57.69	907
No Context 2025.08		51.13	0
Random 2026.02		28.4	-
CurvPrune 2026.02		27.8	-
BM25 2026.02		27.3	-
BM25+Tex 2026.02		27	-
Sel. Ctx 2026.02		20.1	-
LLMLingua 2026.02		18.8	-
Recency 2026.02		12.2	-
Head+Tail 2026.02		8.2	-