| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Robotic task planning and code generation | Basic | Success Rate (SR): 100 | 18 |
| Question Answering | basic (test) | IDK Score: 11.7 | 11 |
| Node-level Regression Uncertainty Quantification | Basic | PICP: 100 | 9 |
| Uncertainty Quantification | Basic Synthetic | Sharpness: 9.08 | 8 |
| Concept Erasure | Basic (test) | Unsafe Rate: 0 | 6 |
| ADS Testing | Basic | Execution Time (s): 63.3 | 3 |