Open-ended generation on NQ (EM, F1, Truth)
[Chart: Exact Match (EM) over time. Top result: CoCoASIG, 0.512 (Feb 10, 2026). Selectable metrics: Exact Match (EM), F1 Score, Truth (%).]
Updated 4d ago
Evaluation Results
Method     Details                     Date      Exact Match (EM)   F1 Score   Truth (%)
CoCoASIG   Model=Qwen-2.5-14B, Mo...   2026.02   0.512              0.7386     0.8899
Diver      Model=Qwen-2.5-14B          2026.02   0.4973             0.7184     0.882
Baseline   Model=Qwen-2.5-14B          2026.02   0.483              0.7109     0.8978
DoLa       Model=Qwen-2.5-14B          2026.02   0.3897             0.628      0.8126
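
The page does not say how the Exact Match and F1 columns are computed. As a rough illustration only, the sketch below shows one common way these two metrics are scored for open-ended QA: SQuAD-style answer normalization followed by string equality (EM) or token overlap (F1). The function names and the normalization rules are assumptions, not taken from this leaderboard, and the Truth metric is not covered.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumed SQuAD-style normalization; the leaderboard may use different rules.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, references: list[str]) -> float:
    """Return 1.0 if the normalized prediction equals any normalized reference."""
    pred = normalize_answer(prediction)
    return float(any(pred == normalize_answer(ref) for ref in references))


def token_f1(prediction: str, references: list[str]) -> float:
    """Return the best token-level F1 between the prediction and any reference.

    In this sketch an empty overlap (including empty strings) scores 0.0.
    """
    pred_tokens = normalize_answer(prediction).split()
    best = 0.0
    for ref in references:
        ref_tokens = normalize_answer(ref).split()
        # Multiset intersection counts each shared token at most min(count) times.
        overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
        if overlap == 0:
            continue
        precision = overlap / len(pred_tokens)
        recall = overlap / len(ref_tokens)
        best = max(best, 2 * precision * recall / (precision + recall))
    return best
```

Under these assumptions, a dataset-level score would be the mean of the per-question values, which is consistent with F1 being at least as high as EM for every method in the table.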