Fact Consolidation on MAB FC-MH 262K context v3 (full)
[Chart: Accuracy (SubEM) and Gap vs Ours Best. Highlighted result: CAR (Chain-Aware Resolution), Accuracy (SubEM) 27, May 31, 2026]
Updated 1d ago
Evaluation Results
Method                        Configuration               Date     Accuracy (SubEM)  Gap vs Ours Best
CAR (Chain-Aware Resolution)  architecture=FC-MH CAR...   2026.05  27                -
GPT-4o                        mode=long-context           2026.05  5                 -22
HippoRAG-v2                   retrieval=hippocampal...    2026.05  5                 -22
GPT-4o-mini                   mode=long-context FIFO      2026.05  5                 -22
GPT-4.1-mini                  mode=long-context           2026.05  5                 -22
BM25                          retrieval=simple lexic...   2026.05  3                 -24
Claude-3.7-Sonnet             mode=long-context           2026.05  2                 -25
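
The "Gap vs Ours Best" values are consistent with subtracting the CAR accuracy (27) from each method's Accuracy (SubEM). A minimal sketch of that arithmetic follows, using the accuracies from the results listed above. It assumes that "Ours" refers to the CAR entry; the function name `gap_vs_ours_best` is hypothetical and not part of the benchmark's tooling.

```python
# Accuracy (SubEM) values copied from the results listed above.
results = {
    "CAR (Chain-Aware Resolution)": 27,
    "GPT-4o": 5,
    "HippoRAG-v2": 5,
    "GPT-4o-mini": 5,
    "GPT-4.1-mini": 5,
    "BM25": 3,
    "Claude-3.7-Sonnet": 2,
}

# Assumption: the "Ours" entry on this leaderboard is the CAR row.
OURS = "CAR (Chain-Aware Resolution)"


def gap_vs_ours_best(scores, ours):
    """Return each baseline's accuracy minus the accuracy of the `ours` entry."""
    best = scores[ours]
    return {name: score - best for name, score in scores.items() if name != ours}


gaps = gap_vs_ours_best(results, OURS)
for name, gap in gaps.items():
    print(f"{name:<20} {gap:+d}")
```

Under this assumption the computed gaps match the listed column (for example, 5 - 27 = -22 for GPT-4o), and the CAR row itself has no gap, which is why it is shown as "-".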