Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning
About
Large Language Models (LLMs) have shown human-like reasoning abilities but still struggle with complex logical problems. This paper introduces a novel framework, Logic-LM, which integrates LLMs with symbolic solvers to improve logical problem-solving. Our method first utilizes LLMs to translate a natural language problem into a symbolic formulation. Afterward, a deterministic symbolic solver performs inference on the formulated problem. We also introduce a self-refinement module, which utilizes the symbolic solver's error messages to revise symbolic formalizations. We demonstrate Logic-LM's effectiveness on five logical reasoning datasets: ProofWriter, PrOntoQA, FOLIO, LogicalDeduction, and AR-LSAT. On average, Logic-LM achieves a significant performance boost of 39.2% over LLMs with standard prompting and 18.4% over LLMs with chain-of-thought prompting. Our findings suggest that Logic-LM, by combining LLMs with symbolic logic, offers a promising avenue for faithful logical reasoning. Code and data are publicly available at https://github.com/teacherpeterpan/Logic-LLM.
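The pipeline described above can be sketched as follows. This is a minimal illustration, not the repository's implementation: the LLM translation stage is stubbed with a canned parse, and naive forward chaining over Horn-style rules stands in for the deterministic symbolic solvers the paper plugs in. All function names (`translate_to_symbolic`, `solve`, `logic_lm`) are hypothetical.

```python
# Sketch of the Logic-LM pipeline: translate -> solve -> self-refine.
# The LLM call is replaced by a fixed lookup; a toy forward-chaining
# solver stands in for the paper's deterministic symbolic solvers.

def translate_to_symbolic(problem: str) -> dict:
    """Stub for the LLM translation stage (natural language -> symbolic form)."""
    # A real system would prompt an LLM here; we return a canned formulation.
    return {
        "facts": {"cat(tom)"},
        "rules": [({"cat(tom)"}, "animal(tom)"),
                  ({"animal(tom)"}, "mortal(tom)")],
        "query": "mortal(tom)",
    }

def solve(formulation: dict) -> bool:
    """Deterministic inference: naive forward chaining to a fixpoint."""
    facts = set(formulation["facts"])
    changed = True
    while changed:
        changed = False
        for premises, conclusion in formulation["rules"]:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return formulation["query"] in facts

def logic_lm(problem: str, max_refinements: int = 3) -> bool:
    """Translate, solve, and retry on solver errors (self-refinement)."""
    formulation = translate_to_symbolic(problem)
    for _ in range(1 + max_refinements):
        try:
            return solve(formulation)
        except (KeyError, TypeError):
            # In the paper, the solver's error message is fed back to the
            # LLM to revise the formulation; stubbed here as a re-translate.
            formulation = translate_to_symbolic(problem)
    return False

print(logic_lm("Tom is a cat. Cats are animals. Animals are mortal. Is Tom mortal?"))
# prints True
```

The key design point is that once the formulation is symbolic, inference is fully deterministic, so any remaining errors are attributable to the translation step, which is exactly what the self-refinement loop targets.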
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Logical reasoning | FOLIO | Accuracy | 71.6 | 119 |
| Logical reasoning | FOLIO (test) | Accuracy | 73.83 | 58 |
| Logical reasoning | ProofWriter (test) | Accuracy | 84.35 | 36 |
| Logical reasoning | PrOntoQA (test) | Accuracy | 90.5 | 36 |
| Logical reasoning | ProofWriter | Accuracy | 64.7 | 32 |
| Logical reasoning | AR-LSAT (test) | Accuracy | 62.1 | 24 |
| Logical reasoning | AR-LSAT | Accuracy | 31.2 | 24 |
| Logical reasoning | LogicalDeduction (test) | Accuracy | 99.5 | 20 |
| Long-Context Stability & Retrieval | LongBench | Stability Rate (SR) | 42.8 | 6 |
| Long-Context Stability & Retrieval | InfBench | Stability Rate (SR) | 36.7 | 6 |