Custom Suite

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Program Induction | Custom Suite Comp-I OOD | Accuracy: 91 | 9 |
| Program Induction | Custom Suite Comp-I ID | Accuracy: 100 | 9 |
| Program Induction | Custom Suite Shift-P OOD | Accuracy: 100 | 9 |
| Program Induction | Custom Suite Shift-P ID | Accuracy: 100 | 9 |
| Program Induction | Custom Suite Shift-L OOD | Accuracy: 99 | 9 |
| Program Induction | Custom Suite Shift-L ID | Accuracy: 100 | 9 |