| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Automated Essay Scoring | ASAP 1.0 (test) | Prompt 1 QWK 0.836 | 51 |
| Evaluation Alignment | ASAP 2.0 | QWK 0.7276 | 16 |
| Automatic Text Evaluation | ASAP | QWK 0.379 | 15 |
| Automated Essay Scoring | ASAP Kaggle 2.0 (test) | QWK 0.84 | 13 |
| Essay Scoring | ASAP++ five-fold averaged results | Overall Score 0.712 | 10 |
| Essay Scoring | ASAP-SAS | QWK (Prompt 3) 0.661 | 10 |
| Automated Essay Scoring | ASAP++ full-data setting | Score P1 0.734 | 10 |
| Multi-trait Automated Essay Scoring | ASAP++ (full-data) | Overall Score 0.781 | 10 |
| Automatic Text Scoring | ASAP (test) | QWK 0.764 | 9 |
| Automatic Essay Scoring | ASAP In-domain (5-fold cross-validation) | Overall QWK 0.785 | 8 |
| Expressive Piano Performance Rendering | ASAP (test) | Velocity JS Div 0.0427 | 7 |
| Multi-trait automated essay scoring | ASAP Prompt 8 (test) | Ideas 0.694 | 6 |
| Multi-trait automated essay scoring | ASAP Prompt 7 (test) | Ideas Score 69.5 | 6 |
| Automated Essay Scoring | ASAP Long Essays (Prompts 1, 2, 8) | Score (P1) 83.6 | 4 |