
ARC-AGI

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Abstract Visual Reasoning | ARC-AGI 2 | Accuracy (Pass@2): 100 | 33 |
| Reasoning | ARC-AGI 2 (test) | Pass@2 Exact Match Accuracy: 24.9 | 28 |
| Abstract Visual Reasoning | ARC-AGI 1 | Accuracy (Pass@2): 98 | 27 |
| Abstract reasoning | ARC-AGI 2 | Pass@2: 29.4 | 16 |
| Abstraction and Reasoning | ARC-AGI 2 (public evaluation) | Pass@2: 100 | 13 |
| Abstract Reasoning | ARC-AGI v1 (test) | Accuracy: 98 | 12 |
| Abstract Reasoning | ARC-AGI v2 (test) | Accuracy: 100 | 11 |
| Compositional Reasoning | ARC-AGI 2 | Accuracy: 33.6 | 11 |
| Abstraction and Reasoning | ARC-AGI Public Training Set (Easy) (60 tasks) | Total Cost: 0.41 | 10 |
| Reasoning | ARC-AGI 2 | Accuracy: 50 | 9 |
| Abstraction and Reasoning | ARC-AGI | ARC-1 Score: 58.2 | 9 |
| Fingerprint Matching | ARC-AGI 1 | FMR: 51 | 7 |
| Reasoning | ARC-AGI public evaluation set V2 | Accuracy: 97.9 | 6 |
| Symbolic Reasoning | ARC-AGI 1 (test) | Pass@2: 47.5 | 6 |
| ARC-AGI | ARC-AGI (test) | Accuracy (ARC-AGI Test): 67 | 5 |
| Abstract Reasoning | ARC-AGI-3 25 Public Games v2 | RHAE: 0 | 4 |
| Abstract Reasoning | ARC-AGI | Frugality Index (Fp): 3.54 | 4 |
| Abstract and compositional reasoning | ARC-AGI 2 (test) | Accuracy (ARC-AGI 2 Test): 51 | 4 |
| Puzzle Solving | ARC-AGI 3 | Levels Won: 2 | 3 |
| Symbolic Reasoning | ARC-AGI 2 (test) | Pass@1: 9.9 | 3 |
| Exploration | ARC-AGI-3 | TU93 Level: 4 | 2 |
| Abstract Reasoning | ARC-AGI (concept evaluation) | Accuracy: 86.8 | 2 |