MetaQA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Question Answering | MetaQA 3-hop | Hits@1: 100 | 47 |
| Question Answering | MetaQA 3-hop | Accuracy: 14.7 | 30 |
| Question Answering | MetaQA 2-hop | Accuracy: 63.17 | 30 |
| Question Answering | MetaQA 1-hop | Accuracy: 82.77 | 30 |
| Question Answering | MetaQA | 1-hop EM@1: 92.7 | 30 |
| Knowledge Base Question Answering | MetaQA 1hop | Hits@1: 100 | 28 |
| Question Answering | MetaQA 2-hop | Hits@1: 100 | 28 |
| Knowledge Graph Question Answering | MetaQA 2-hop (test) | Hits@1: 100 | 24 |
| Knowledge Graph Question Answering | MetaQA 3-hop | Accuracy: 89.6 | 16 |
| Knowledge Graph Question Answering | MetaQA 1-hop | Accuracy: 87.6 | 16 |
| Multi-hop Knowledge Graph Question Answering | MetaQA | Hit@1 (2-hop): 100 | 11 |
| Question Answering | MetaQA 3-hop | Normalized Latency (vs Vanilla RAG): 0.86 | 9 |
| Question Answering | MetaQA 2-hop | Normalized Average Latency: 0.83 | 9 |
| Question Answering | MetaQA 1-hop | Average Latency (Normalized): 0.8 | 9 |
| Question Answering | MetaQA 3-hop | Average Edge Budget: 2.1 | 9 |
| Question Answering | MetaQA 2-hop | Average Edge Budget: 0.78 | 9 |
| Question Answering | MetaQA 1-hop | Hits@1: 97.5 | 9 |
| Knowledge Graph Question Answering | MetaQA 2-hop 1.0 (test) | Accuracy: 94.4 | 9 |
| Knowledge Graph Question Answering | MetaQA 1-hop 1.0 (test) | Accuracy (%): 97.5 | 9 |
| Knowledge Graph Question Answering | MetaQA 50% KG setting (test) | Hits@1 (1-hop): 76 | 9 |
| Hallucination Detection | MetaQA 1hop (Qwen2.5-7B) | AUC: 85.28 | 7 |
| Hallucination Detection | MetaQA 1hop (LLaMA2-7B) | AUC: 83.41 | 7 |
| Question Answering | MetaQA 1-hop wikimovies | Hits@1 (KB): 97.5 | 6 |
| Question Answering | MetaQA | HS: 95.5 | 4 |
| Stealthiness evaluation | MetaQA | Detected Samples: 8,321 | 3 |
Showing 25 of 27 rows