Question Answering on 24K Long-Context QA Tasks
[Chart: MultiFieldQA Score by date. Highlighted point: SAC, 17.75, Oct 10, 2025.]
Evaluation Results
| Method | Configuration | Links | MultiFieldQA Score | NarrativeQA Score | Qasper Score | Single-Doc Avg. Score | 2WikiMQA Score | MuSiQue Score | HotpotQA Score | Multi-Doc Avg. Score |
|---|---|---|---|---|---|---|---|---|---|---|
| SAC | Context Length=24K, Co... | 2025.10 | 17.75 | 11.33 | 11.67 | 13.58 | 24.78 | 12.36 | 26.03 | 21.06 |
| SAC | Method Variant=ae+lm,... | 2025.10 | 17.38 | 10.49 | 9.05 | 12.31 | 22.47 | 7.76 | 23.88 | 18.04 |
| EPL | Context Length=24K, Co... | 2025.10 | 14.34 | 10.5 | 7.58 | 10.81 | 21.99 | 8.43 | 23.72 | 18.05 |
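
The two Avg. columns appear to be plain arithmetic means of the three task scores that precede them, rounded to two decimals. The page does not state this; it is an inference from the reported numbers. A minimal Python sketch that recomputes them is below. The row labels, `ROWS`, `mean2`, and `recompute` are hypothetical names chosen for illustration, and the configuration strings are truncated on the page, so the labels only identify rows by position.

```python
# Scores copied from the Evaluation Results rows.
# Row labels are illustrative only; the page truncates the configuration text.
ROWS = {
    "SAC (row 1)": {
        "single_doc": [17.75, 11.33, 11.67],  # MultiFieldQA, NarrativeQA, Qasper
        "multi_doc": [24.78, 12.36, 26.03],   # 2WikiMQA, MuSiQue, HotpotQA
        "reported": (13.58, 21.06),           # Single-Doc Avg., Multi-Doc Avg.
    },
    "SAC (row 2)": {
        "single_doc": [17.38, 10.49, 9.05],
        "multi_doc": [22.47, 7.76, 23.88],
        "reported": (12.31, 18.04),
    },
    "EPL (row 3)": {
        "single_doc": [14.34, 10.5, 7.58],
        "multi_doc": [21.99, 8.43, 23.72],
        "reported": (10.81, 18.05),
    },
}


def mean2(scores):
    """Arithmetic mean rounded to two decimals (assumed averaging rule)."""
    return round(sum(scores) / len(scores), 2)


def recompute(rows):
    """Return {label: (single_doc_avg, multi_doc_avg)} for every row."""
    return {
        label: (mean2(row["single_doc"]), mean2(row["multi_doc"]))
        for label, row in rows.items()
    }


if __name__ == "__main__":
    for label, (single, multi) in recompute(ROWS).items():
        reported = ROWS[label]["reported"]
        status = "matches" if (single, multi) == reported else "differs from"
        print(f"{label}: single-doc {single}, multi-doc {multi} ({status} reported)")
```

Under this assumption the recomputed averages agree with the reported ones for all three rows, which is why the Avg. columns can be read as unweighted means over the three single-document and three multi-document tasks respectively.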