Symbolic and Logical Reasoning on Big-Bench Hard (BBH)

Exact Match Performance: 88.1

PromptWizard

[Chart: Exact Match performance over time, Nov 18, 2023 to Feb 11, 2026]
Updated 4d ago

Evaluation Results

Date      Exact Match   (unlabeled)
2026.02   88.1          -
2026.02   86.24         82.08
2026.02   86.1          -
2026.02   85.34         80.5
2026.02   84.13         78.84
2026.02   82.63         78.72
2026.02   82.61         78.5
2026.02   81.65         67.75
2026.02   80.86         -
2026.02   78.65         -
2026.02   75.03         -
2023.11   69.04         -
2023.11   55.38         -
2023.11   51.08         -
2023.11   50.18         -
2023.11   50.01         -
2023.11   47.84         -
2023.11   45.93         -
2023.11   44.68         -
2023.11   42.8          -
2023.11   38.47         -
2023.11   33.6          -