Abstraction and Reasoning on ARC-AGI 2 (public evaluation)
[Chart: Pass@2 by method. Human Panel: 100 (Apr 2, 2026)]
Updated 13d ago
Evaluation Results
| Method | Category | Date | Pass@2 |
|---|---|---|---|
| Human Panel | Human | 2026.04 | 100 |
| CoreThink Meta-Classifier | Neuro-Symboli... | 2026.04 | 30.8 |
| J. Berman | Hybrid | 2026.04 | 29.4 |
| NVARC | Hybrid | 2026.04 | 27.6 |
| Compositional Reasoner | Neuro-Symbolic | 2026.04 | 24.4 |
| GPT-5-Pro | LLM | 2026.04 | 18.3 |
| Grok-4 (Thinking) | LLM | 2026.04 | 16 |
| Claude Opus 4 (16K) | LLM | 2026.04 | 8.6 |
| o3 (High) | LLM | 2026.04 | 6.5 |
| o4-mini (High) | LLM | 2026.04 | 6.1 |
| Claude Sonnet 4 (16K) | LLM | 2026.04 | 5.9 |
| o3-Pro (High) | LLM | 2026.04 | 4.9 |
| Gemini 2.5 Pro (32K) | LLM | 2026.04 | 4.9 |