Multi-Hop Fact-based Reasoning on MAB FC-MH, 262K v3 (test)
[Chart: Accuracy over time. Plotted point: Chain-Aware Resolution (CAR), Accuracy 27, May 31, 2026]
Updated 1d ago
Evaluation Results
Method                        Configuration              Date     Accuracy
Chain-Aware Resolution (CAR)  Backbone=gpt-4o-mini,...   2026.05  27
GPT-4o (long-context)         Backbone=gpt-4o, Pipel...  2026.05  5
HippoRAG-v2 (best published)  Backbone=gpt-4o-mini       2026.05  5
BM25                          Backbone=gpt-4o-mini,...   2026.05  3
Zep / Graphiti                Backbone=gpt-4o-mini,...   2026.05  3