
Scientific Reasoning on GPQA diamond (ACC, TOK, LAT)

91.9 Accuracy

Gemini 3-Pro

[Chart: results over time, Feb 2, 2026 to Feb 6, 2026]
Updated 3d ago

Evaluation Results

Method   Links
91.9---
2026.02
91.9--8,000
2026.02
87.6--14,000
86.4---
2026.02
86.36---
85.7---
2026.02
84.5--13,000
82.4---
2026.02
82.4--7,000
2026.02
48.9911.89272.24-
2026.02
48.177.58156.09-
2026.02
47.0212.32579.22-
2026.02
45.658.1254.73-
2026.02
44.658.05166.54-
2026.02
37.524.27149.93-
2026.02
37.524.27149.93-
2026.02
35.468.6949.87-
2026.02
35.468.6949.87-
2026.02
32.3111.7774.31-
2026.02
32.3111.7774.31-
2026.02
29.8115.48197.33-
2026.02
29.8115.48197.33-
2026.02
29.410.45101.84-
2026.02
29.410.45101.84-