
AI2 ARC

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Scientific Reasoning | AI2-ARC Scientific | Recall: 79 | 32
Personalized Interaction | AI2 ARC Synthetic | Personalization Score: 7.38 | 6