Open Question Answering on BGB (test)
[Chart: Factual Correctness (%) over time. Best result: 76.4, Gemma 3 (12B), Jan 20, 2026.]
Evaluation Results
Method                       Train Data      Links    Factual Correctness (%)
Gemma 3 (12B)                Difficulty-...  2026.01  76.4
GPT-5 (min. reasoning)       Base Model      2026.01  74.7
GPT-5-mini (min. reasoning)  Base Model      2026.01  62.7
LLaMA 3.1 (8B)               Difficulty-...  2026.01  59.2
Gemma 3 (12B)                Standard In...  2026.01  46.6
LLaMA 3.1 (8B)               Base Model      2026.01  39.2
LLaMA 3.1 (8B)               Standard In...  2026.01  37.6
Gemma 3 (12B)                Base Model      2026.01  36.8