| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Abstract Reasoning | PGM Triple Pairs (test) | Accuracy: 45.2 | 5 |
| Abstract Reasoning | PGM Triple Pairs (val) | Accuracy: 68.6 | 5 |
| Abstract Reasoning | PGM Attribute Pairs (test) | Accuracy: 36.8 | 5 |
| Abstract Reasoning | PGM Attribute Pairs (val) | Accuracy: 70.1 | 5 |
| Abstract Reasoning | PGM Neutral (test) | Accuracy: 77 | 5 |
| Abstract Reasoning | PGM Neutral (val) | Accuracy: 77.3 | 5 |
| Abstract Reasoning | PGM Extrapolation (test) | Accuracy: 19.4 | 4 |
| Abstract Reasoning | PGM Extrapolation (val) | Accuracy: 92.2 | 4 |
| Abstract Reasoning | PGM Interpolation (test) | Accuracy: 70.5 | 4 |
| Abstract Reasoning | PGM Interpolation (val) | Accuracy: 89.9 | 4 |