Question Answering on HotpotQA (Acc)
Best accuracy: 43.56 (Naive RAG)
[Chart: Accuracy, Mar 26, 2026]
Updated 22d ago
Evaluation Results
Method         Configuration                Date      Accuracy
Naive RAG      LLM=Llama-3.1-8B, Writ...    2026.03   43.56
RePlug         LLM=Llama-3.1-8B, Writ...    2026.03   43.12
Naive RAG      LLM=Llama-3.1-8B, Writ...    2026.03   42.33
RePlug         LLM=Llama-3.1-8B, Writ...    2026.03   42.12
Naive RAG      LLM=Gemma-3-12B, Write...    2026.03   41.99
RePlug         LLM=Gemma-3-12B, Write...    2026.03   41.97
Naive RAG      LLM=Gemma-3-12B, Write...    2026.03   41.13
RePlug         LLM=Gemma-3-12B, Write...    2026.03   41.13
FLARE          LLM=Llama-3.1-8B, Writ...    2026.03   32.12
FLARE          LLM=Llama-3.1-8B, Writ...    2026.03   31.25
Self-RAG       LLM=Llama-3.1-8B, Writ...    2026.03   31.04
FLARE          LLM=Gemma-3-12B, Write...    2026.03   30.18
FLARE          LLM=Gemma-3-12B, Write...    2026.03   29.82
Self-RAG       LLM=Llama-3.1-8B, Writ...    2026.03   29.33
Self-RAG       LLM=Gemma-3-12B, Write...    2026.03   28.73
Self-RAG       LLM=Gemma-3-12B, Write...    2026.03   27.49
No Retrieval   LLM=Gemma-3-12B              2026.03   24.28
No Retrieval   LLM=Llama-3.1-8B             2026.03   23.9