| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Task Routing | Science | Cost ($) 0.0276 | 15 |
| Taxonomy Expansion | Science (SCI) SemEval-2016 Task 13 | Chi-Squared 13.2 | 10 |
| Science Reasoning | Science (out-of-distribution) | Accuracy 65.12 | 6 |
| Named Entity Recognition | Science | F1 Score 56.3 | 5 |
| Text-to-SQL | Science Benchmark | Execution Accuracy 51.8 | 4 |
| Task-Efficient Routing | Science Curated Task Benchmark 1.0 (test) | Average Cost 0.0054 | 3 |
| Taxonomy Expansion | Science | Prec@1 44.7 | 3 |
| Named Entity Recognition | Science English | F1 Score 62.29 | 2 |