| Task Name | Dataset Name | SOTA Result | Trend | |
|---|---|---|---|---|
| End-to-End Defense in RAG | SciFact | ASR 0 | 69 | |
| Information Retrieval | SciFact (test) | NDCG@10 0.906 | 65 | |
| Information Retrieval | SciFact | nDCG@10 77.77 | 51 | |
| Complex reasoning | SciFact (test) | Macro-F1 76.17 | 37 | |
| RAG Leakage Attack | SCIFACT | CCL 56.3 | 36 | |
| Information Retrieval | SciFact BEIR (test) | nDCG@10 76.6 | 36 | |
| Feature Attribution | SciFact | Comprehensiveness 74 | 33 | |
| Scientific Fact Verification | SciFact | Macro F1 83.03 | 25 | |
| Information Retrieval | SciFact BEIR | NDCG@10 99.36 | 24 | |
| Claim Correction | SciFact Retrieved evidence | SARI Score 41.7093 | 21 | |
| Information Retrieval | SciFact | NDCG@10 (Dense) 77.3 | 21 | |
| Information Retrieval | scifact | Recall@100 96.7 | 19 | |
| Information Retrieval | SciFact | Faithfulness 67 | 18 | |
| Document Retrieval | SciFact BEIR | Delta nDCG@10 0.8 | 16 | |
| Information Retrieval | SciFact | nDCG@10 0.77 | 16 | |
| Sentence Similarity | SciFact Sentence | Spearman Correlation 0.359 | 15 | |
| Information Retrieval | Scifact | nDCG 82 | 15 | |
| Fact-checking | SCIFact | Balanced Acc 90.3 | 15 | |
| Reranking | SciFact | nDCG 76.4 | 12 | |
| Sentence-Level Confidence Prediction | SciFact | AUROC 0.91 | 12 | |
| Logical Retrieval | SciFact BEIR v1 (test) | nDCG@10 0.64 | 12 | |
| Claim Verification | SCIFACT | Accuracy 94.32 | 12 | |
| Claim Correction | SciFact Gold evidence | SARI 42.6331 | 11 | |
| Retrieval | SciFact | Recall@10 0.952 | 10 | |
| Retrieval | SciFact-G | R@10 35.1 | 10 |