
Towards Trustworthy Legal AI through LLM Agents and Formal Reasoning

About

Legal decisions should be logical and based on statutory laws. While large language models (LLMs) are good at understanding legal text, they cannot provide verifiable justifications. We present L4L, a solver-centric framework that enforces formal alignment between LLM-based legal reasoning and statutory laws. The framework integrates role-differentiated LLM agents with SMT-backed verification, combining the flexibility of natural language with the rigor of symbolic reasoning. Our approach operates in four stages: (1) Statute Knowledge Building, where LLMs autoformalize legal provisions into logical constraints and validate them through case-level testing; (2) Dual Fact-and-Statute Extraction, in which prosecutor- and defense-aligned agents independently map case narratives to argument tuples; (3) Solver-Centric Adjudication, where SMT solvers check the legal admissibility and consistency of the arguments against the formalized statute knowledge; (4) Judicial Rendering, in which a judge agent integrates solver-validated reasoning with statutory interpretation and similar precedents to produce a legally grounded verdict. Experiments on public legal benchmarks show that L4L consistently outperforms baselines, while providing auditable justifications that enable trustworthy legal AI.
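To make the solver-centric adjudication stage (3) concrete, here is a minimal sketch of checking whether an argument tuple is consistent with a formalized statute. The statute, its propositions, and the argument tuples below are hypothetical toy examples, not from the paper, and the brute-force satisfiability check stands in for the SMT solver (e.g. Z3) the framework actually relies on:

```python
from itertools import product

# Hypothetical propositional encoding of a toy statute:
# "theft" holds iff the suspect took property AND the owner lacked consent.
VARS = ["took_property", "lacked_consent", "theft"]

def statute(m):
    """Evaluate the formalized statute under assignment m (name -> bool)."""
    return m["theft"] == (m["took_property"] and m["lacked_consent"])

def consistent(facts):
    """True if some assignment extending `facts` satisfies the statute,
    i.e. the argument tuple is admissible under the formalized law."""
    free = [v for v in VARS if v not in facts]
    for values in product([False, True], repeat=len(free)):
        m = dict(facts, **dict(zip(free, values)))
        if statute(m):
            return True
    return False

# Prosecutor-aligned argument: both elements present, theft asserted.
prosecution = {"took_property": True, "lacked_consent": True, "theft": True}
# Defense-aligned argument: concedes the taking but asserts consent.
defense = {"took_property": True, "lacked_consent": False, "theft": False}
# Inadmissible argument: asserts theft while conceding consent.
inadmissible = {"lacked_consent": False, "theft": True}

print(consistent(prosecution))   # True
print(consistent(defense))       # True
print(consistent(inadmissible))  # False
```

A real deployment would replace the brute-force loop with SMT constraints over richer sorts (dates, amounts, roles), but the adjudication question is the same: is the argument tuple satisfiable jointly with the statute's constraints?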

Linze Chen, Yufan Cai, Zhe Hou, Jin Song Dong • 2025

Related benchmarks

Task | Dataset | Result | Rank
Criminal Sentencing and Legal Validity Evaluation | LeCaRD v2 | RMSE 9.98 | 16
Specific Provision Prediction | LeCaRD v2 | Precision 82.35 | 16
General Provision Prediction | LeCaRD v2 | Precision 64.05 | 16
Criminal Sentencing, Legal Validity, and Suspect-Level Performance Evaluation | LEEC | RMSE 20.95 | 14
Robustness Evaluation | Perturbation Dataset | Change Accuracy 62.56 | 8
