Linguistic Reasoning on BigBench Hard Snarks

Accuracy: 0.554

TextGrad

Updated 2mo ago

Evaluation Results

Method	Links	Accuracy
TextGrad 2026.05		0.554
ReElicit 2026.05		0.551
PromptBreeder 2026.05		0.539
OPRO 2026.05		0.537
APE 2026.05		0.527