
ARC-AGI

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Abstract Visual Reasoning | ARC-AGI 1 | Accuracy (Pass@2): 98 | 27 |
| Abstract Visual Reasoning | ARC-AGI 2 | Accuracy (Pass@2): 100 | 26 |
| Abstraction and Reasoning | ARC-AGI 2 (public evaluation) | Pass@2: 100 | 13 |
| Compositional Reasoning | ARC-AGI 2 | Accuracy: 33.6 | 11 |
| Abstraction and Reasoning | ARC-AGI Public Training Set (Easy) (60 tasks) | Total Cost: 0.41 | 10 |
| Reasoning | ARC-AGI 2 (test) | Accuracy: 43.3 | 10 |
| Reasoning | ARC-AGI 2 | Accuracy: 50 | 9 |
| Abstraction and Reasoning | ARC-AGI | ARC-1 Score: 58.2 | 9 |
| Reasoning | ARC-AGI public evaluation set V2 | Accuracy: 97.9 | 6 |
| Symbolic Reasoning | ARC-AGI 1 (test) | Pass@2: 47.5 | 6 |
| ARC-AGI | ARC-AGI (test) | Accuracy (ARC-AGI Test): 67 | 5 |
| Abstract and compositional reasoning | ARC-AGI 2 (test) | Accuracy (ARC-AGI 2 Test): 51 | 4 |
| Puzzle Solving | ARC-AGI 3 | Levels Won: 2 | 3 |
| Symbolic Reasoning | ARC-AGI 2 (test) | Pass@1: 9.9 | 3 |
| Abstract Reasoning | ARC-AGI (concept evaluation) | Accuracy: 86.8 | 2 |