| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Logical Reasoning | LogiQA | LogiQA Accuracy | 78.9 | 181 |
| Logical Reasoning | LogiQA (test) | Accuracy | 86 | 151 |
| Logical Reasoning | LogiQA | Accuracy | 80.4 | 100 |
| Logical Reasoning | LogiQA | Accuracy | 50.23 | 98 |
| Logical Reasoning | LogiQA (val) | Accuracy | 58.37 | 50 |
| Logical Reasoning | LogiQA (dev) | Accuracy | 47.3 | 40 |
| Logical Reasoning | LogiQA-2 | Accuracy | 83.8 | 34 |
| Logical Reasoning | LogiQA original (test) | Accuracy | 43.16 | 22 |
| Commonsense Reasoning | LogiQA | Accuracy | 29.8 | 21 |
| Logical Reasoning | LogiQA | Acc@t1 | 46.6 | 20 |
| Logical Reasoning | LogiQA | Pass@1 Accuracy | 0.88 | 18 |
| Correctness Prediction | LogiQA | Accuracy | 67.75 | 18 |
| Question Answering | LogiQA | Accuracy | 44.29 | 17 |
| Logical Reasoning | LogiQA | Pass@1 Accuracy | 48.61 | 14 |
| Question Answering | LogiQA (test) | Accuracy | 85.75 | 12 |
| Logical Reasoning | LogiQA | Accuracy | 74.1 | 11 |
| Logical Reasoning | LogiQA 1.0 (test) | Accuracy | 86 | 11 |
| Logical Reasoning | LogiQA Chinese | Pass@1 Accuracy | 52.4 | 10 |
| Logical Reasoning | LogiQA English | Pass@1 Accuracy | 53 | 10 |
| True/False Reasoning | LogiQA 2.0 (test) | Accuracy | 0.614 | 8 |
| Logical Reasoning | LOGIQA | Hit@1 (LOGIQA) | 56.4 | 7 |
| Downstream Task | LogiQA | Accuracy | 22.12 | 7 |
| Logical Reasoning | LogiQA | Selection Accuracy | 43.57 | 6 |
| Logical reasoning multi-choice QA | LogiQA v2 (test) | Macro F1 Score | 55.5 | 6 |
| Logical Reasoning | LogiQA 1.0 (val) | Accuracy | 42.24 | 6 |