Logical Reasoning on ARC Challenge (test)

95.7 Accuracy

EAGLE3

Updated 4mo ago

Evaluation Results

Method	Links	Accuracy
EAGLE3 2026.02		95.7	1,822	164.2
Think 2026.02		95.6	1,812	156.5
NoThink* 2026.02		95.1	1,889	159.8
DEER 2026.02		94.6	1,011	200.3
SpecExit 2026.02		94.5	588	71.4
EAGLE3 2026.02		59.2	1,378	496.4
SpecExit 2026.02		50.3	500	253.7
Vanilla 2026.02		49.9	1,917	628.5
DEER 2026.02		47.5	1,029	531.3
NoThink 2026.02		12.6	135	13.6