Logical Reasoning on GeoShape BBH

Accuracy: 90

MemAPO

Updated 4mo ago

Evaluation Results

Method	Links	Accuracy
MemAPO 2026.03		90
MemAPO 2026.03		77
TextGrad 2026.03		76
ProTeGi 2026.03		71
CoT 2026.03		64
Step-Back 2026.03		61
Step-Back 2026.03		58
ProTeGi 2026.03		56
SPO 2026.03		54
RaR 2026.03		53
CoT 2026.03		47
RaR 2026.03		43
PromptBreeder 2026.03		42
TextGrad 2026.03		39
OPRO 2026.03		36
IO 2026.03		35
OPRO 2026.03		35
SPO 2026.03		28
IO 2026.03		23
PromptBreeder 2026.03		23