
MultiHopRAG

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| End-to-end Question Answering | MultiHopRAG (test val) | Accuracy 47.14 | 20 |
| Multi-session Retrieval-Augmented Generation | MultihopRAG (test) | F1 Score 64.4 | 12 |
| Multi-hop Reasoning | MultiHopRAG | EM 89.6 | 11 |
| Information Retrieval | MultiHopRAG (test) | MRR@10 63.58 | 11 |
| Query-relevant Extraction | MultiHopRAG | F1 Score 32 | 8 |
| Main Content Extraction | MultiHopRAG | F1 Score 87.4 | 8 |
| Multi-hop Reasoning | MultiHopRAG Average 1.0 (test) | Relevancy 64.47 | 4 |
| Multi-hop Reasoning | MultiHopRAG Temporal 1.0 (test) | Relevancy 39.38 | 4 |
| Multi-hop Reasoning | MultiHopRAG Comparison 1.0 (test) | Relevancy 60.28 | 4 |
| Multi-hop Reasoning | MultiHopRAG Inference 1.0 (test) | Relevancy 96.76 | 4 |