| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Logical reasoning | Logic | Accuracy: 68.07 | 16 |
| Spatial and logical reasoning | Logic | Score: 66.89 | 6 |
| Language Modeling | Logic (val) | Perplexity: 131.95 | 2 |
| Logic | Logic Hard | Baseline Score: 34.63 | 1 |
| Logic | Logic Medium | Baseline Score: 47.37 | 1 |
| Logic | Logic Easy | Baseline Score: 0.425 | 1 |