
Long-context Question Answering on HotpotQA Fixed Chunk 2048

QA Score: 60.03

Baseline

Updated 2mo ago

Evaluation Results

Method	Links	QA Score
Baseline 2026.03		60.03
Our 2026.03		59.67
Baseline 2026.03		59.22
Our + Reorder 2026.03		58.2
Our 2026.03		57.39
Baseline 2026.03		54.1
EPIC (15%) 2026.03		53.62
CacheBlend 2026.03		53.52
EPIC (15%) 2026.03		52.84
CacheBlend 2026.03		51.77
Our 2026.03		51.5
Our + Reorder 2026.03		50.53
Our + Reorder 2026.03		50.53
No Recompute 2026.03		50.24
EPIC (15%) 2026.03		47.55
CacheBlend 2026.03		47.2
No Recompute 2026.03		46.71
No Recompute 2026.03		46.33