Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

Who Did What

Benchmarks

Task NameDataset NameSOTA ResultTrend
Reading ComprehensionWho Did What Relaxed (WDW-R) (test)
Accuracy72.6
5
Reading ComprehensionWho Did What Relaxed (WDW-R) (dev)
Accuracy73.1
3
Showing 2 of 2 rows