Question Answering on Pacific unambiguous (dev)
[Chart: Reward. Highlighted point: SGP-Oracle, Reward 70.95, Dec 3, 2025. Selectable metrics: Reward, F1, Clarification Rate, Mutual Agreement.]
Updated 4d ago
Evaluation Results
| Method | Configuration | Date | Reward | F1 | Clarification Rate | Mutual Agreement |
| --- | --- | --- | --- | --- | --- | --- |
| SGP-Oracle | base_model=Gemma-2-9b,... | 2025.12 | 70.95 | 83.6 | 15.6 | 9.39 |
| SGP | base_model=Gemma-2-9b,... | 2025.12 | 65.7 | 79.52 | 20.56 | 9.3 |
| Answer | base_model=Gemma-2-9b,... | 2025.12 | 56.91 | 71.52 | 0 | 0 |
| Prompted-COT | base_model=Gemma-2-9b,... | 2025.12 | 52.07 | 68.43 | 4.86 | 7.17 |
| Clarify | base_model=Gemma-2-9b,... | 2025.12 | 46.9 | 68.85 | 100 | 0 |
| Prompted | base_model=Gemma-2-9b,... | 2025.12 | 42.44 | 69.51 | 4.82 | 19.15 |
| Multi-answer (MA) | base_model=Gemma-2-9b,... | 2025.12 | -4.04 | 63.34 | 0 | 100 |
| Clarify-and-MA | base_model=Gemma-2-9b,... | 2025.12 | -22.25 | 57.77 | 100 | 100 |