Open Question Answering on BGB (test)

76.4 Factual Correctness (%)

Gemma 3 (12B)

Updated 4mo ago

Evaluation Results

Method	Factual Correctness (%)
Gemma 3 (12B) 2026.01	76.4
GPT-5 (min. reasoning) 2026.01	74.7
GPT-5-mini (min. reasoning) 2026.01	62.7
LLaMA 3.1 (8B) 2026.01	59.2
Gemma 3 (12B) 2026.01	46.6
LLaMA 3.1 (8B) 2026.01	39.2
LLaMA 3.1 (8B) 2026.01	37.6
Gemma 3 (12B) 2026.01	36.8