| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Anomaly Detection | 34 benchmark datasets aggregate (test) | Cost per Example: 0.113 | 20 |
| Base-to-new generalization | Average of 10 benchmark datasets | Base Accuracy: 81.31 | 10 |
| Classification | 19 benchmark datasets | Max ALL-RRS Score: 81.5 | 7 |
| End-to-end detection and explanation | 7 benchmark datasets | Metric: - | 0 |