
DollyEval

Benchmarks

Task Name              Dataset Name      SOTA Result                  Trend
Instruction Following  DollyEval         Rouge-L: 32.5                114
Generation             DollyEval (test)  LLM-as-a-Judge Score: 62.02  2
Instruction Following  DollyEval (test)  Score: -                     0