| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| Combined (AIME'24, AIME'25, HMMT'25, BrUMO'25) | Bayes@1 | Kendall's tau_b (vs. Gold Standard) | 0.865 | 1 | 1mo ago |
| BrUMO'25 | Bayes_R0@1 | Kendall's tau_b (vs. Gold Standard) | 0.858 | 1 | 1mo ago |
| AIME'25 | Bayes_R0@1 | Kendall's tau_b (vs. Gold Standard) | 0.798 | 1 | 1mo ago |
| AIME'24 | Bayes_R0@1 | Kendall's tau_b (vs. Gold Standard) | 0.779 | 1 | 1mo ago |