
Average Out-of-domain

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Multi-task Reasoning | Average Out-of-domain | Accuracy (OOD): 49.57 | 24