Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

Dialogue-level evaluation

Benchmarks

Task NameDataset NameSOTA ResultTrend
Dialogue GenerationDialogue-level evaluation Ethical and Affective Alignment
Respectful Tone8.1739
9
Dialogue-level Guidance Quality EvaluationDialogue-level evaluation (N=54)
State Tracking3.3
6
Showing 2 of 2 rows