
AnnoMI

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Longitudinal Classification | AnnoMI (80/20) | Macro F1 Score: 52.6 | 24 |
| Motivational Interviewing Behavior Evaluation | AnnoMI high-quality subset | R/Q: 1.28 | 12 |
| Motivational Interviewing Global Score Prediction | AnnoMI (test) | Cultivate Score: 2.85 | 12 |
| Dialogue Strategy Alignment | AnnoMI (full) | MI-i: 3.4 | 11 |
| Exploration Focus | AnnoMI | Exploration Focus: 2.81 | 10 |
| Motivational Interviewing Dialogue Generation | AnnoMI (test) | Client Readability: 4.45 | 5 |
| Client Simulation | AnnoMI (test) | PE: 9.01 | 5 |
| Motivational Interviewing Response Generation | AnnoMI (test) | MI-i (%): 2.1 | 5 |
| Counselor performance expert evaluation | AnnoMI Expert Evaluation Subset | Cultivating Change Talk: 4.06 | 4 |
| Client Simulation | AnnoMI 1.0 (test) | Personas: 4.72 | 4 |
| Dialogue Strategy Alignment | AnnoMI v1 (test) | MI-i Score (%): 5.5 | 4 |
| Client Simulation Receptivity Consistency | AnnoMI | Avg Receptivity (Threshold 1.0): 5 | 2 |
| Motivational Interviewing response generation | AnnoMI (human evaluation) | RAND Score: 21 | 2 |