Abstract Reasoning on Private human-curated 177 ARC-style tasks (evaluation set)
Best Pass@k: 55.93 (Mini-Arch, Feb 4, 2026)
[Chart: Pass@k for Mini-Arch]
Updated 1mo ago
Evaluation Results
Method      k (number of attempts)   Date      Pass@k
Mini-Arch   5                        2026.02   55.93
Mini-Arch   4                        2026.02   52.54
Mini-Arch   3                        2026.02   51.79
Mini-Arch   2                        2026.02   45.99
Mini-Arch   1                        2026.02   35.12
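
The page does not state how Pass@k is computed. As a minimal sketch, assuming the common convention that a task counts as solved when at least one of its first k attempts is correct, the score is the percentage of tasks solved. The function name `pass_at_k` and the `results` data below are hypothetical and are not taken from the benchmark.

```python
from typing import Sequence


def pass_at_k(attempts: Sequence[Sequence[bool]], k: int) -> float:
    """Percentage of tasks with at least one correct attempt among the first k.

    `attempts[i][j]` is True when attempt j on task i was correct.
    This convention is an assumption; the leaderboard does not document its formula.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not attempts:
        raise ValueError("need at least one task")
    # A task is solved if any of its first k attempts is correct.
    solved = sum(1 for task in attempts if any(task[:k]))
    return 100.0 * solved / len(attempts)


# Hypothetical outcomes for 4 tasks with 3 attempts each (illustration only).
results = [
    [False, True, False],
    [False, False, False],
    [True, True, True],
    [False, False, True],
]

for k in (1, 2, 3):
    print(f"pass@{k} = {pass_at_k(results, k):.2f}")
```

Under this convention the score cannot decrease as k grows, which matches the trend in the table: more attempts give more chances to solve each task.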