Long-context Question Answering on NarrativeQA Fixed Chunk 2048
[Chart: Score over time. Top score: 32.64 (Baseline), Mar 5, 2026.]
Updated 1mo ago
Evaluation Results

Method          Configuration               Date      Score
Baseline        Model=ChatGLM, Setting...   2026.03   32.64
Our             Model=ChatGLM, Setting...   2026.03   32.39
Our + Reorder   Model=ChatGLM, Setting...   2026.03   31.4
CacheBlend      Model=ChatGLM, Setting...   2026.03   29.7
EPIC (15%)      Model=ChatGLM, Setting...   2026.03   29.62
Our + Reorder   Model=LLaMA, Setting=F...   2026.03   29.57
Our             Model=LLaMA, Setting=F...   2026.03   28.91
No Recompute    Model=ChatGLM, Setting...   2026.03   27.58
EPIC (15%)      Model=LLaMA, Setting=F...   2026.03   27.01
CacheBlend      Model=LLaMA, Setting=F...   2026.03   26.85
No Recompute    Model=LLaMA, Setting=F...   2026.03   26.39
Our + Reorder   Model=Qwen, Setting=Fi...   2026.03   22.51
CacheBlend      Model=Qwen, Setting=Fi...   2026.03   21.7
Our             Model=Qwen, Setting=Fi...   2026.03   21.1
EPIC (15%)      Model=Qwen, Setting=Fi...   2026.03   19.99
Baseline        Model=Qwen, Setting=Fi...   2026.03   16.54
Baseline        Model=LLaMA, Setting=F...   2026.03   16.23
No Recompute    Model=Qwen, Setting=Fi...   2026.03   11.37