| Dataset Name | SOTA Method | Metric | Value | Trend | |
|---|---|---|---|---|---|
| Ten benchmark topics (100 generated research ideas) | EvoSci | Average Wins | 4.27 | 15 | 8d ago |
| Expert Evaluation Biotech Domain (test) | DeepInnovator | Novelty Winrate vs Qwen-14B | 91.7 | 1 | 3mo ago |
| Expert Evaluation Education Domain (test) | DeepInnovator | Novelty Winrate vs Qwen | 80 | 1 | 3mo ago |
| Expert Evaluation Law Domain (test) | DeepInnovator | Novelty (Winrate vs Qwen) | 69.2 | 1 | 3mo ago |