| Dataset Name | SOTA Method | Metric | Trend | ||
|---|---|---|---|---|---|
| SI-Score size synthetic | RobustViT | R@1 66.5 | 31 | 1mo ago | |
| SI-Score rotation synthetic | RobustViT | R@1 58 | 31 | 1mo ago | |
| SI-Score location synthetic | RobustViT | R@1 48.3 | 31 | 1mo ago | |
| MultiRLVR | Master-RM | FPR (%) 0.02 | 20 | 1mo ago | |
| MATH | AdvJudge-Zero | FPR (%) 0 | 20 | 1mo ago | |
| GSM8K | Master-RM | FPR (%) 0 | 20 | 1mo ago | |
| AIME | Master-RM | FPR 0 | 20 | 1mo ago | |
| SA-1b photos | CIN | Identity Bit Accuracy 100 | 9 | 1mo ago | |
| Meta AI images | CIN | Identity Bit Acc 100 | 9 | 1mo ago | |
| Perturbation Dataset | L4L | Change Accuracy 62.56 | 8 | 1mo ago | |
| LLMBar | Qwen3-30B-A3B-Thinking-2507 | Accuracy 83.07 | 8 | 1mo ago | |
| BiasBench | | Accuracy 82.5 | 8 | 1mo ago | |
| Lexical Variation (abbr.) | Mamba | Jensen-Shannon Divergence 0.0476 | 8 | 1mo ago | |
| CIFAR-100-C | Deep ens. (LPBN) | mCE 43.15 | 8 | 1mo ago | |
| CartPole A=9.5 (test) | +DR | Average Reward 231.8 | 6 | 1mo ago | |
| CartPole A=9.0 (test) | +ESN-OA-PT | Average Reward 830.9 | 6 | 1mo ago | |
| CartPole A=8.5 (test) | +ESN-OA-PT | Average Reward 810.1 | 6 | 1mo ago | |
| CartPole A=8.0 (test) | +ESN-OA-PT | Average Reward 1,000 | 6 | 1mo ago | |
| ImageNet Robustness Variants (IN-A, IN-R, IN-Sketch, IN-C) (test) | SiameseIM | IN-A Top-1 Acc 43.8 | 5 | 1mo ago | |
| SA-V | Video Seal | Identity Bit Accuracy 100 | 4 | 1mo ago | |
| MovieGen | Video Seal | Identity Bit Acc 100 | 4 | 1mo ago | |
| Stress Tests | SAT | Quantity Stress Score 58.1 | 4 | 1mo ago | |
| Lexical Variation typos | Mamba | Jensen-Shannon Divergence 0.0761 | 4 | 1mo ago | |
| Lexical Variation synonym | Mamba | Jensen-Shannon Divergence 0.013 | 4 | 1mo ago | |
| Lexical Variation spelling | Mamba | Jensen-Shannon Divergence 0.0054 | 4 | 1mo ago | |