LLMs as ASP Programmers: Self-Correction Enables Task-Agnostic Nonmonotonic Reasoning

About

Recent large language models (LLMs) have achieved impressive reasoning milestones but continue to struggle with high computational costs, logical inconsistencies, and sharp performance degradation on high-complexity problems. While neuro-symbolic methods attempt to mitigate these issues by coupling LLMs with symbolic reasoners, existing approaches typically rely on monotonic logics (e.g., SMT) that cannot represent defeasible reasoning, an essential component of human cognition. We present "LLM+ASP," a framework that translates natural language into Answer Set Programming (ASP), a nonmonotonic formalism based on stable model semantics. Unlike prior "LLM+ASP" approaches that require manually authored knowledge modules, domain-specific prompts, or evaluation restricted to single problem classes, our framework operates without any per-task engineering and applies uniformly across diverse reasoning tasks. Our system uses an automated self-correction loop in which structured feedback from the ASP solver enables iterative refinement. Evaluating across six diverse benchmarks, we demonstrate that: (1) stable model semantics allow LLMs to naturally express default rules and exceptions, outperforming SMT-based alternatives by significant margins on nonmonotonic tasks; (2) iterative self-correction is the primary driver of performance, effectively replacing the need for handcrafted domain knowledge; (3) compact in-context reference guides substantially outperform verbose documentation, revealing a "context rot" phenomenon where excessive context hinders constraint adherence.
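
To make the point about default rules and exceptions concrete, the following is a minimal sketch of stable model semantics for a tiny ground program, written in plain Python. It is not the authors' system and not how a production ASP solver works: it enumerates every candidate set of atoms by brute force and checks each one against its Gelfond-Lifschitz reduct. The atom names (bird, flies, abnormal, penguin) and the function names are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations

# A ground normal rule is (head, positive_body, negative_body), bodies as tuples.
# Example: flies :- bird, not abnormal.  ->  ("flies", ("bird",), ("abnormal",))


def least_model(definite_rules):
    """Least model of a negation-free program, computed by fixpoint iteration."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and set(pos) <= model:
                model.add(head)
                changed = True
    return model


def stable_models(rules):
    """Brute-force enumeration of stable models (illustration only)."""
    atoms = sorted({h for h, _, _ in rules}
                   | {a for _, pos, neg in rules for a in pos + neg})
    found = []
    for k in range(len(atoms) + 1):
        for combo in combinations(atoms, k):
            candidate = set(combo)
            # Gelfond-Lifschitz reduct: drop rules blocked by the candidate,
            # then strip the remaining negative literals.
            reduct = [(h, pos) for h, pos, neg in rules
                      if not (set(neg) & candidate)]
            if least_model(reduct) == candidate:
                found.append(candidate)
    return found


# Default rule: birds fly unless known to be abnormal (hypothetical atoms).
base = [
    ("bird", (), ()),
    ("flies", ("bird",), ("abnormal",)),
]

# Exception: adding the fact "penguin" makes the bird abnormal.
with_exception = base + [
    ("penguin", (), ()),
    ("abnormal", ("penguin",), ()),
]

default_models = stable_models(base)
exception_models = stable_models(with_exception)
```

In this sketch, `default_models` contains a single stable model in which `flies` holds, while `exception_models` contains a single stable model in which it does not: adding a fact withdrew an earlier conclusion, which is the behavior a monotonic logic cannot express directly.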

Adam Ishay, Joohyung Lee • 2026

Related benchmarks

Task                    | Dataset              | Result                 | Rank
General Evaluation      | Aggregate Benchmarks | Average Score 93.9     | 37
Logical reasoning       | BOARD (BOARDGAMEQA)  | Accuracy 96.3          | 15
Constraint Satisfaction | ZL-XL                | Accuracy 97.3          | 10
Constraint Satisfaction | ZL-XXL               | Accuracy (%) 97.7      | 10
Constraint Satisfaction | SudokuBench          | Accuracy 74.7          | 10
Nonmonotonic reasoning  | MultiLogicNMR        | Skeptical Accuracy 100 | 10
Planning                | Mystery Blocksworld  | Accuracy 98.3          | 10