Linguistic Reasoning on BigBench Hard Hyperbaton

80.2 Accuracy

PromptBreeder

Updated 2mo ago

Evaluation Results

Method	Links	Accuracy
PromptBreeder 2026.05		80.2
ReElicit 2026.05		79.6
TextGrad 2026.05		79.1
OPRO 2026.05		78.3
APE 2026.05		77