| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| ScholarIdeas-AI contribution rubrics | ScholarEval-GPT-4.1 | Coverage (mean) | 3.45 | 31 | 3mo ago |
| ScholarIdeas (Evaluation) | | Reference Inv. | 23.33 | 11 | 3mo ago |
| ScholarIdeas Ecology | ScholarEval Claude | Coverage | 2.9 | 11 | 3mo ago |
| ScholarIdeas Biochemistry | ScholarEval Claude | Coverage | 2.64 | 11 | 3mo ago |
| ScholarIdeas Neuroscience | ScholarEval GPT-4.1 | Coverage | 2.74 | 11 | 3mo ago |
| D_point (randomly sampled 60 instances) | InnoEval | Rationality Win Rate | 0.8848 | 5 | 3mo ago |
| 5-seed blinded subset (test) | Si baseline | Novelty | 6.24 | 4 | 19d ago |
| ScholarIdeas-AI | ScholarEval Claude | Evidence | 76.6 | 2 | 3mo ago |
| Expert User Study | ScholarEval | Citations | 2.66 | 2 | 3mo ago |