| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Scientific Reasoning | SciWorld | Accuracy | 95.9 | 164 |
| Science Simulation | SciWorld | Accuracy | 94.6 | 41 |
| Embodied Agentic | SciWorld | Accuracy | 88.9 | 21 |
| Next-state prediction | SciWorld (SW) | EM Accuracy | 98.64 | 16 |
| Task success | SciWorld | Real | 68.21 | 14 |
| Scientific Reasoning | SciWorld | Success Rate (SR) | 59.48 | 14 |
| Science simulation | SciWorld | Progress Rate | 82.6 | 12 |