| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Open-domain Dialogue Evaluation | Free run Mechanical Turk 1 (initial data collection run) | Overall Score: 0.534 | 10 |
| Open-domain Dialogue Evaluation | Free Run 2 1.0 (secondary data collection run) | Overall Quality Score: 0.455 | 10 |