RAG Evaluation Datasets

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Membership Inference Attack Defense | RAG Evaluation Datasets (NQ, PubMedQA, TriviaQA) | Contextual Recall: 49.7 | 7
Poisoning Defense | RAG Evaluation Datasets NQ, PubMedQA, TriviaQA | Contextual Recall: 59.4 | 7