DollyEval

Benchmarks

Task Name	Dataset Name	Metric	SOTA Result
Instruction Following	DollyEval	ROUGE-L	32.5	114
Instruction Following	DollyEval 500 (test)	ROUGE-L	31.3	24
Dialogue Generation	DollyEval	ROUGE-L	24.19	16
Generation	DollyEval (test)	LLM-as-a-Judge Score	62.02	2
Instruction Following	DollyEval (test)	Score	-	0
