| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Refusal Evaluation | CCP Sensitive | Reject Rate: 92.35 | 13 |
| Component-Configurator-Problem | CCP Overall (test) | Solving Percentage: 89 | 3 |
| Component-Configurator-Problem | CCP Hard split | Solving Percentage: 96 | 3 |
| Component-Configurator-Problem | CCP (Easy) | Solving Percentage: 100 | 3 |