| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Multi-hop Question Answering | Legal | F1 Score 71.93 | 14 |
| Retrieval | Legal | Legal Score 51.16 | 10 |
| Misaligned Task Learning | Legal In-domain | Misalignment 0.87 | 6 |
| Emergent Misalignment Measurement | Legal | Misalignment 0.58 | 6 |
| Grammar Checking | Legal (in-house) | Precision 95.2 | 5 |
| Private Information Tagging | Legal (test) | Precision 78.72 | 4 |
| Cross-domain generalization | Legal (test) | Accuracy 100 | 3 |
| Legal Prediction | Legal | BS 0.228 | 3 |