| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Logical Reasoning | LogicBench | Accuracy: 80.4 | 28 |
| Skill Retrieval | LogicBench | Recall@1: 31.4 | 11 |
| Skill Retrieval | LogicBench | nDCG@1: 31.4 | 11 |
| Constrained Decoding | LogicBench | Constraint Satisfaction: 98.5 | 7 |
| Logical Reasoning | LogicBench SEM variant | Accuracy: 92.26 | 2 |
| Logical Reasoning | LogicBench Original | Accuracy: 92.81 | 2 |