| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| String-level response similarity | RA-QA Global, Discriminative tasks | BERTScore 0.9 | 8 |
| String-level response similarity | RA-QA Multiple-choice, Discriminative tasks | BERTScore 0.85 | 4 |
| String-level response similarity | RA-QA Single-Verify, Discriminative tasks | BERTScore 94 | 4 |
| Discriminative tasks | RA-QA | Accuracy 72 | 4 |
| Regression tasks | RA-QA | MAE 2.29 | 3 |