Coupling Large Language Models with Logic Programming for Robust and General Reasoning from Text
About
While large language models (LLMs), such as GPT-3, appear to be robust and general, their reasoning ability does not compete with the best models trained for specific natural language reasoning problems. In this study, we observe that a large language model can serve as a highly effective few-shot semantic parser: it can convert natural language sentences into a logical form that serves as input for answer set programs (ASP), a logic-based declarative knowledge representation formalism. The combination results in a robust and general system that can handle multiple question-answering tasks without requiring retraining for each new task. It only needs a few examples to guide the LLM's adaptation to a specific task, along with reusable ASP knowledge modules that can be applied to multiple tasks. We demonstrate that this method achieves state-of-the-art performance on several NLP benchmarks, including bAbI, StepGame, CLUTRR, and gSCAN. Additionally, it successfully tackles robot planning tasks that an LLM alone fails to solve.
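The pipeline described above has two stages: the LLM parses a story into logical atoms, and an ASP solver applies a reusable knowledge module to derive answers. The sketch below illustrates that shape in plain Python, with the LLM step stubbed out and a naive forward-chaining loop standing in for the ASP solver; the story sentences, predicate names, and the single kinship composition rule are illustrative assumptions (in the style of a CLUTRR example), not the paper's actual modules.

```python
# Illustrative sketch only: the LLM's semantic-parsing step is hard-coded,
# and forward chaining stands in for an ASP solver such as clingo.

# Stubbed LLM output: atoms parsed from a two-sentence story.
# mother(X, Y) is read as "X is the mother of Y".
story_atoms = [
    ("mother", "anna", "bob"),    # "Anna is Bob's mother."
    ("mother", "carol", "anna"),  # "Carol is Anna's mother."
]

# A reusable "knowledge module": composition rules of the form
# r1(X, Y), r2(Y, Z) -> r3(X, Z), mirroring an ASP rule like
#   grandmother(X, Z) :- mother(X, Y), mother(Y, Z).
COMPOSE = {
    ("mother", "mother"): "grandmother",
}

def derive(atoms):
    """Forward-chain the composition rules to a fixpoint."""
    facts = set(atoms)
    changed = True
    while changed:
        changed = False
        for (r1, x, y) in list(facts):
            for (r2, y2, z) in list(facts):
                if y == y2 and (r1, r2) in COMPOSE:
                    new_fact = (COMPOSE[(r1, r2)], x, z)
                    if new_fact not in facts:
                        facts.add(new_fact)
                        changed = True
    return facts

facts = derive(story_atoms)
print(("grandmother", "carol", "bob") in facts)  # True
```

Because the knowledge module is separate from the parsing step, the same rules can be reused across tasks; only the few-shot examples guiding the LLM change per task.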
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Logical reasoning | StepGame (k=4) | Accuracy | 93.8 | 56 |
| Logical reasoning | StepGame (k=10) | Accuracy | 88.1 | 56 |
| Logical reasoning | StepGame (k=3) | Accuracy | 89.2 | 56 |
| Logical reasoning | CLUTRR | Accuracy | 56.7 | 42 |
| Logical reasoning | CLUTRR (test) | Accuracy | 67.3 | 35 |