Question Answering on NarrativeQA Helmet benchmark
[Chart: F1 Score and EM over time. Highlighted point: MiA-Emb-8B + MiA-Gen-14B, F1 Score 49.5, Dec 19, 2025.]
Evaluation Results
| Method | Settings | Date | F1 Score | EM |
| MiA-Emb-8B + MiA-Gen-14B | +Summ=true, Tokens=13k | 2025.12 | 49.5 | 29.8 |
| MiA-Emb-8B + MiA-Gen-14B | +Summ=true, Tokens=4k | 2025.12 | 48.7 | 28.9 |
| GPT4o-2405 | +Summ=false, Tokens=12... | 2025.12 | 46.5 | - |
| GPT4o-2408 | +Summ=false, Tokens=12... | 2025.12 | 43.1 | - |
| Gemini-1.5-Pro | +Summ=false, Tokens=2M... | 2025.12 | 42.8 | - |
| MiA-Emb-8B | Generative Model=Qwen2... | 2025.12 | 39.1 | 20.4 |
| MiA-Emb-8B | Generative Model=GPT4o... | 2025.12 | 38.9 | 21.9 |
| MiA-Emb-8B | Generative Model=Qwen2... | 2025.12 | 36.7 | 18.2 |
| Qwen3-Emb-8B | Generative Model=Qwen2... | 2025.12 | 34.8 | 17.7 |