| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| SCALE MultiChallenge | BRAID | Accuracy | 65.1 | 81 | 4d ago |
| BBH (val) | G2IS | Accuracy | 65.81 | 42 | 4d ago |
| BBH | | Accuracy | 85.93 | 40 | 4d ago |
| BIG-bench Hard | FLAN-T5 | Orig Score | 39.3 | 7 | 3d ago |
| BBH | | BBH Solution Rate | 67.4 | 6 | 4d ago |
| Natural Scenes Dataset (NSD) (test) | Neuro-Vision to Language | BLEU-1 | 65.41 | 2 | 4d ago |
| BIG-bench Hard Orig QA | - | Original Metric Value | - | 0 | 4d ago |