Multiple-choice QA on RQA-MC
[Chart: Accuracy over time. Highlighted point: Decoding_r, Accuracy 81, May 24, 2024. Available metrics: Accuracy, Certifiable Accuracy, Benign Accuracy (BAcc).]
Updated 17d ago
Evaluation Results
| Method | Details | Date | Accuracy | Certifiable Accuracy | Benign Accuracy (BAcc) |
|---|---|---|---|---|---|
| Decoding_r | LLM=Mistral-I7B, Defen... | 2024.05 | 81 | 71 | - |
| Vanilla | LLM=Mistral-I7B, Defen... | 2024.05 | 80 | - | - |
| Vanilla | LLM=Llama2-C7B, Defens... | 2024.05 | 79 | - | - |
| Decoding_r | LLM=Llama2-C7B, Defens... | 2024.05 | 78 | 69 | - |
| No RAG | LLM=Llama2-C7B, Defens... | 2024.05 | 21 | - | - |
| No RAG | LLM=Mistral-I7B, Defen... | 2024.05 | 9 | - | - |
| No RAG | LLM=GPT-3.5, Retrieved... | 2024.05 | - | - | 8 |
| Vanilla | LLM=GPT-3.5, Retrieved... | 2024.05 | - | 0 | 80.4 |
| RobustRAG (Keyword) | LLM=GPT-3.5, Retrieved... | 2024.05 | - | 69.6 | 76.4 |