| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Logical Reasoning | LogiQA | Accuracy 50.23 | 98 |
| Logical Reasoning | LogiQA (test) | Accuracy 86 | 92 |
| Logical Reasoning | LogiQA | Accuracy 80.4 | 84 |
| Logical Reasoning | LogiQA | LogiQA Accuracy 78.9 | 56 |
| Logical Reasoning | LogiQA (val) | Accuracy 58.37 | 50 |
| Logical Reasoning | LogiQA (dev) | Accuracy 47.3 | 40 |
| Logical Reasoning | LogiQA-2 | Accuracy 83.8 | 30 |
| Logical Reasoning | LogiQA original (test) | Accuracy 43.16 | 22 |
| Logical Reasoning | LogiQA | Acc@t1 46.6 | 20 |
| Logical Reasoning | LogiQA | Pass@1 Accuracy 0.88 | 18 |
| Correctness Prediction | LogiQA | Accuracy 67.75 | 18 |
| Logical Reasoning | LogiQA | Pass@1 Accuracy 48.61 | 14 |
| Question Answering | LogiQA (test) | Accuracy 85.75 | 12 |
| Logical Reasoning | LogiQA | Accuracy 74.1 | 11 |
| Logical Reasoning | LogiQA 1.0 (test) | Accuracy 86 | 11 |
| True/False Reasoning | LogiQA 2.0 (test) | Accuracy 0.614 | 8 |
| Logical Reasoning | LOGIQA | Hit@1 (LOGIQA) 56.4 | 7 |
| Downstream Task | LogiQA | Accuracy 22.12 | 7 |
| Logical Reasoning | LogiQA | Selection Accuracy 43.57 | 6 |
| Logical reasoning multi-choice QA | LogiQA v2 (test) | Macro F1 Score 55.5 | 6 |
| Logical Reasoning | LogiQA 1.0 (val) | Accuracy 42.24 | 6 |
| Logical Reasoning | LogiQA v1 (dev) | Accuracy 49.6 | 4 |
| Logical Reasoning | LogiQA | Normalized Accuracy 30.11 | 2 |
| Question Answering | LogiQA | Accuracy 23 | 2 |