Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

Who Did What (WDW)

Benchmarks

Task NameDataset NameSOTA ResultTrend
Reading ComprehensionWho Did What (WDW) (test)
Accuracy71.7
5
Reading ComprehensionWho Did What (WDW) (dev)
Accuracy72.3
3
Showing 2 of 2 rows