| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Mathematical Reasoning | FrontierScience-Olympiad | Accuracy: 50.8 | 63 |
| Scientific Olympiad Reasoning | FrontierScience-Olympiad | Biology Accuracy: 43.5 | 30 |
| Scientific Problem Solving | FrontierScience-Olympiad | Token Efficiency Ratio (B_method/BMV): 5.59 | 27 |
| Scientific olympiad problem solving | FrontierScience Olympiad | Accuracy: 80 | 12 |
| LLM Self-Consistency Certification | FrontierScience Olympiad | Bonferroni Score: 53 | 10 |
| Scientific problem solving | FrontierScience Olympiad N=20 (test) | Metric: - | 0 |