| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Longitudinal Classification | AnnoMI (80/20) | Macro F1 Score | 52.6 | 24 |
| Motivational Interviewing Behavior Evaluation | AnnoMI high-quality subset | R/Q | 1.28 | 12 |
| Motivational Interviewing Global Score Prediction | AnnoMI (test) | Cultivate Score | 2.85 | 12 |
| Dialogue Strategy Alignment | AnnoMI (full) | MI-i | 3.4 | 11 |
| Exploration Focus | AnnoMI | Exploration Focus | 2.81 | 10 |
| Motivational Interviewing Dialogue Generation | AnnoMI (test) | Client Readability | 4.45 | 5 |
| Client Simulation | AnnoMI (test) | PE | 9.01 | 5 |
| Motivational Interviewing Response Generation | AnnoMI (test) | MI-i (%) | 2.1 | 5 |
| Counselor performance expert evaluation | AnnoMI Expert Evaluation Subset | Cultivating Change Talk | 4.06 | 4 |
| Client Simulation | AnnoMI 1.0 (test) | Personas | 4.72 | 4 |
| Dialogue Strategy Alignment | AnnoMI v1 (test) | MI-i Score (%) | 5.5 | 4 |
| Client Simulation Receptivity Consistency | AnnoMI | Avg Receptivity (Threshold 1.0) | 5 | 2 |
| Motivational Interviewing response generation | AnnoMI (human evaluation) | RAND Score | 21 | 2 |