Multiple Choice Question Answering on BGB (test)

78.8 Exact Accuracy

GPT-5 (min. reasoning)

[Chart: Exact Accuracy by date; Jan 20, 2026]
Updated 4d ago

Evaluation Results

Date      Exact Accuracy
2026.01   78.8
2026.01   75.8
2026.01   75.1
2026.01   68.2
2026.01   67.1
2026.01   67
2026.01   64.2
2026.01   64