| Dataset Name | SOTA Method | Metric | Value | Results | Updated |
|---|---|---|---|---|---|
| Average MBO, NHO, SPO, TMC | ReAct | Avg APD | 14.6 | 14 | 3mo ago |
| TMC | PIEVO | Solution Quality | 93.25 | 14 | 3mo ago |
| SPO | PIEVO | SQ (%) | 37.85 | 14 | 3mo ago |
| NHO | PIEVO | Solution Quality (SQ) | 0.9636 | 14 | 3mo ago |
| MBO | PIEVO | Solution Quality | 153.53 | 14 | 3mo ago |
| AIME Agent | COAT | Step 1 Mean | 37.78 | 4 | 2mo ago |
| Circle Packing | CausalPlanner (Meta) | Step 1 Mean Score | 2.348 | 4 | 2mo ago |
| Second Autocorr. Inequality | CausalEvolve | Step 1 Mean Score | 0.781 | 4 | 2mo ago |
| Hadamard Matrix | CausalPlanner (Meta) | Step 1 Mean Score | 55.6 | 4 | 2mo ago |
| BaisBench Scientific Discovery (BAIS-SD) | - | Mean SSD | 75.9 | 3 | 26d ago |
| Erdos | Baseline | Rmax (Training) | 2.6174 | 2 | 21d ago |
| CP26 | UG-TTT | Rmax (Training) | 2.6359 | 2 | 21d ago |
| AC2 | UG-TTT | Rmax (Training) | 0.8563 | 2 | 21d ago |
| AC1 | UG-TTT | Rmax during training | 0.6406 | 2 | 21d ago |
| Scientific Discovery | CASTER | Param & Constraint Acc | 38.5 | 2 | 1d ago |
| Scientific Workflow Execution and Categorical Provenance | - | - | - | 0 | 1d ago |
| Protein-Mechanics Symbolic DAG World Models | - | - | - | 0 | 1d ago |
| Scientific Research Ideation and Reports | - | - | - | 0 | 1d ago |
| Material Knowledge Graphs | - | - | - | 0 | 1d ago |
| ICAIS AI Scientist Track 2025 | - | - | - | 0 | 2mo ago |