| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Logical Reasoning | English | BPA | 98.8 | 24 |
| Text-To-Speech | English (test) | WER | 0.0165 | 21 |
| Dependency Parsing | English (en) (test) | LAS | 95.33 | 16 |
| Unsupervised Constituency Parsing | English SPMRL (test) | S-F1 | 69.7 | 15 |
| Implicit Discourse Relation classification | English (test) | Precision | 62 | 12 |
| Morphological Alignment | English 300 MB Corpora | Morph. Score | 64.4 | 9 |
| Bias-Penalized Accuracy Evaluation | English | Bias-Penalized Accuracy (BPA) | 98.78 | 9 |
| Speech-to-Singing conversion | English (test) | LSD | 2.512 | 6 |
| RST Parsing | English | Span Score | 88.2 | 6 |
| Speech Intelligibility Assessment | English | Absolute Kendall's Tau | 0.768 | 5 |
| Speaker Diarization | English | DER | 10.272 | 5 |
| Simple Definition Generation | English (test) | BLEU | 15.05 | 5 |
| Named Entity Recognition | English (test) | F1 Score | 91.05 | 5 |
| Zero-shot Text-to-Speech | English Speech Emotion Prompt | WER | 0.0194 | 4 |
| Handwriting Generation | English (test) | Content Score | 0.8552 | 4 |
| Tokenization | English Reasoning | Average Tokens per Sample | 6,192.77 | 3 |
| Tokenization | English General | Avg Tokens per Sample | 794.79 | 3 |
| Language Modeling | English Tail (test) | Relative P95 RTF Reduction | 4.66 | 3 |
| Language Modeling | English VA (test) | Relative P95 RTF Reduction | -23.79 | 3 |
| Vector Font Reconstruction | English EN (test) | Error | 5.2 | 3 |
| Complex Definition Generation | English (test) | BLEU | 24.17 | 3 |
| General Language Evaluation | English lm-evaluation-harness | AGIEval Acc (Norm) | 0.259 | 2 |