Long-context evaluation on LB v2 (ALL)
[Chart: Accuracy (ALL). Labeled point: RAG, 38 (Jan 26, 2026).]
Evaluation Results
| Method | Setting | Date | Accuracy (ALL) |
|---|---|---|---|
| RAG | Context Length=32k | 2026.01 | 38 |
| RR+Judge(y) | Context Type=Full Cont... | 2026.01 | 32.6 |
| Base | Context Type=Full Cont... | 2026.01 | 32 |
| RID+Q(y) | Context Type=Full Cont... | 2026.01 | 32 |
| RID+Q(y) | Context Type=KV-Cache... | 2026.01 | 32 |
| RID+C(y) | Context Type=Full Cont... | 2026.01 | 31.4 |
| RAO(y) | Context Type=Full Cont... | 2026.01 | 31.2 |
| RR+Judge(y) | Context Type=KV-Cache... | 2026.01 | 31.2 |
| RID(y) | Context Type=Full Cont... | 2026.01 | 31 |
| Base | Context Type=KV-Cache... | 2026.01 | 30.2 |
| RID(y) | Context Type=KV-Cache... | 2026.01 | 30 |
| RAO(y) | Context Type=KV-Cache... | 2026.01 | 29.6 |
| RID+C(y) | Context Type=KV-Cache... | 2026.01 | 29.6 |