Question Answering on HotPotQA (test) (Metrics: EM, F1)
[Chart: best EM 46.4 (INTRA), May 7, 2026; metrics: EM, F1]
Updated 26d ago
Evaluation Results
Method                   Setting             Date     EM    F1
INTRA                    Generator=T5Gemma2  2026.05  46.4  58
Hybrid RAG               Generator=T5Gemma2  2026.05  43.4  54.3
BGE                      Generator=T5Gemma2  2026.05  41.9  53
Qwen3-Emb-4B + Reranker  Generator=T5Gemma2  2026.05  41.6  53.6
MaxSim                   Generator=T5Gemma2  2026.05  40.7  52.2
BM25                     Generator=T5Gemma2  2026.05  40.5  52
Qwen3-Emb-4B             Generator=T5Gemma2  2026.05  40.3  51.2
Qwen3-Emb-0.6B           Generator=T5Gemma2  2026.05  37    47.7
TF-IDF                   Generator=T5Gemma2  2026.05  34.2  44.5