SciFact

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| End-to-End Defense in RAG | SciFact | ASR 0 | 69 |
| Information Retrieval | SciFact (test) | NDCG@10 0.906 | 65 |
| Information Retrieval | SciFact | nDCG@10 77.77 | 51 |
| Complex reasoning | SciFact (test) | Macro-F1 76.17 | 37 |
| RAG Leakage Attack | SCIFACT | CCL 56.3 | 36 |
| Information Retrieval | SciFact BEIR (test) | nDCG@10 76.6 | 36 |
| Feature Attribution | SciFact | Comprehensiveness 74 | 33 |
| Scientific Fact Verification | SciFact | Macro F1 83.03 | 25 |
| Information Retrieval | SciFact BEIR | NDCG@10 99.36 | 24 |
| Claim Correction | SciFact Retrieved evidence | SARI Score 41.7093 | 21 |
| Information Retrieval | SciFact | NDCG@10 (Dense) 77.3 | 21 |
| Information Retrieval | scifact | Recall@100 96.7 | 19 |
| Information Retrieval | SciFact | Faithfulness 67 | 18 |
| Document Retrieval | SciFact BEIR | Delta nDCG@10 0.8 | 16 |
| Information Retrieval | SciFact | nDCG@10 0.77 | 16 |
| Sentence Similarity | SciFact Sentence | Spearman Correlation 0.359 | 15 |
| Information Retrieval | Scifact | nDCG 82 | 15 |
| Fact-checking | SCIFact | Balanced Acc 90.3 | 15 |
| Reranking | SciFact | nDCG 76.4 | 12 |
| Sentence-Level Confidence Prediction | SciFact | AUROC 0.91 | 12 |
| Logical Retrieval | SciFact BEIR v1 (test) | nDCG@10 0.64 | 12 |
| Claim Verification | SCIFACT | Accuracy 94.32 | 12 |
| Claim Correction | SciFact Gold evidence | SARI 42.6331 | 11 |
| Retrieval | SciFact | Recall@10 0.952 | 10 |
| Retrieval | SciFact-G | R@10 35.1 | 10 |
Showing 25 of 57 rows