| Dataset Name | SOTA Method | Metric | Trend | |
|---|---|---|---|---|
| Overall Average across five scenarios | GraphPlanner | Accuracy 63.6 | 10 | 1mo ago |
| Popular | GraphPlanner | Accuracy 52 | 10 | 1mo ago |
| World Knowledge (WK) | GraphPlanner | Accuracy 38 | 10 | 1mo ago |
| Commonsense Reasoning (CS) | GraphRouter | Accuracy 82.7 | 10 | 1mo ago |
| Code (MBPP, HumanEval) | GraphPlanner | Accuracy 76 | 10 | 1mo ago |
| Math (GSM8K, MATH) | GraphPlanner | Accuracy 67 | 10 | 1mo ago |