| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Brain network classification | ABCD-OCD | F1 Score: 70.62 | 30 |
| Workflow Extraction | ABCD | Macro Score: 58.55 | 24 |
| Classification | ABCD Anx | Accuracy: 58.16 | 15 |
| Classification | ABCD-ADHD | Accuracy: 63.75 | 15 |
| Brain network classification | ABCD-Anx | F1 Score: 62.15 | 15 |
| Brain network classification | ABCD-ADHD | F1 Score: 65.5 | 15 |
| Semantic Retrieval | ABCD v1 (test) | Avg Cost per Query: 0 | 10 |
| Brain Network Classification | ABCD | AUROC: 96.2 | 9 |
| Dialog Workflow Extraction and Evaluation | ABCD 1.0 (test) | Macro Score: 61.2 | 8 |
| General intelligence classification | ABCD fMRI | Accuracy: 0.682 | 7 |
| Intelligence Regression | ABCD | MSE: 0.932 | 6 |
| Sex Classification | ABCD | Accuracy: 80.1 | 6 |
| Action State Tracking | ABCD in-distribution (test) | B-Slot Acc: 85.2 | 6 |
| Language Modeling | ABCD | Accuracy: 97.73 | 4 |
| Action Selection Task (AST) | ABCD (out-of-distribution) | B-Slot Accuracy: 94.6 | 4 |
| Conditional abstraction and aggregation | ABCD | Operational Cost: 0.07 | 3 |
| Cascading Dialogue Success | ABCD (test) | Intent Accuracy: 90.5 | 3 |
| Semantic Retrieval | ABCD (test) | Runtime (s): 10.1 | 2 |
| Conditional aggregation | ABCD (test) | Runtime (s): 3.34 | 2 |
| Action State Tracking | ABCD (test) | B-Slot Score: 0.936 | 2 |