LLM Performance Estimation on ARC (test)

1.165 MAE (%)

SparseEval

Feb 8, 2026
Updated 4d ago

Evaluation Results

Date      MAE (%)   (unlabeled)
2026.02   1.165     0.917
2026.02   1.227     0.91
2026.02   1.404     0.902
2026.02   1.581     0.883
2026.02   1.778     0.863
2026.02   2.274     0.787
2026.02   2.289     0.868
2026.02   2.375     0.866
2026.02   2.413     0.873
2026.02   2.448     0.852
2026.02   2.612     0.761
2026.02   2.646     0.854
2026.02   2.816     0.862
2026.02   2.89      0.867
2026.02   2.959     0.758
2026.02   3.426     0.824
2026.02   3.642     0.698
2026.02   4.004     0.769
2026.02   5.332     0.641
2026.02   10.62     0.578