| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Question Answering | en multifield | F1 Score: 44.29 | 21 |
| Dialect Robustness | EN | Success Rate: 57 | 11 |
| Text-to-Speech | EN | WER: 3.1 | 3 |
| Function Invocation | EN Ver. (Dual) | Token Usage: 1,300.7 | 3 |
| Function Invocation | EN Ver. (Single) | Invocation Accuracy: 0.9 | 3 |