MadLibs

Benchmarks

Task Name            Dataset Name                  SOTA Result      Trend
Pair's Relationship  MadLibs Filtered Hard (test)  Accuracy: 62.06  7
Pair's Relationship  MadLibs Hard (test)           Accuracy: 56.17  7
Pair's Relationship  MadLibs Easy (test)           Accuracy: 78.5   7
Person's Activity    MadLibs Filtered Hard (test)  Accuracy: 74.45  7
Person's Activity    MadLibs Hard (test)           Accuracy: 71.13  7
Person's Activity    MadLibs Easy (test)           Accuracy: 87.57  7