Lemur: Integrating Large Language Models in Automated Program Verification

About

The demonstrated code-understanding capability of LLMs raises the question of whether they can be used for automated program verification, a task that demands high-level abstract reasoning about program properties, which is challenging for verification tools. We propose a general methodology that combines the power of LLMs and automated reasoners for automated program verification. We formally describe this methodology as a set of transition rules and prove its soundness. We instantiate the calculus as a sound automated verification procedure and demonstrate practical improvements on a set of synthetic and competition benchmarks.
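As a rough illustration of the general idea of pairing a proposer with a checker, the toy sketch below tries candidate loop invariants one by one and keeps the first that a checker accepts. It is not the calculus or the procedure from the paper: the candidate list is a hard-coded stand-in for LLM suggestions, the checker is a brute-force test over a small finite range of states (an assumption standing in for an automated reasoner, and only sound within that range), and all names (`is_inductive`, `proves_property`, `verify`) are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Pred = Callable[[int], bool]

# Toy program under verification (state is the single variable x):
#   x = 0
#   while x < 10: x += 2
#   assert x == 10
init: Pred = lambda x: x == 0
guard: Pred = lambda x: x < 10
step: Callable[[int], int] = lambda x: x + 2
prop: Pred = lambda x: x == 10


def is_inductive(inv: Pred, states: Sequence[int]) -> bool:
    """Bounded check: inv holds initially and is preserved by one loop step."""
    for s in states:
        if init(s) and not inv(s):
            return False  # initiation fails
        if inv(s) and guard(s) and not inv(step(s)):
            return False  # consecution fails
    return True


def proves_property(inv: Pred, states: Sequence[int]) -> bool:
    """Bounded check: inv plus loop exit implies the asserted property."""
    return all(prop(s) for s in states if inv(s) and not guard(s))


def verify(candidates: List[Tuple[str, Pred]],
           states: Sequence[int]) -> Optional[str]:
    """Return the name of the first candidate that passes both checks."""
    for name, inv in candidates:
        if is_inductive(inv, states) and proves_property(inv, states):
            return name
    return None


# Hard-coded stand-in for proposals that an LLM might make.
candidates: List[Tuple[str, Pred]] = [
    ("x >= 0", lambda x: x >= 0),    # inductive, but too weak
    ("x <= 10", lambda x: x <= 10),  # not inductive (x = 9 steps to 11)
    ("x % 2 == 0 and x <= 10", lambda x: x % 2 == 0 and x <= 10),
]

result = verify(candidates, range(-5, 20))
print(result)
```

In this sketch the proposer is never trusted: a candidate only counts once the checker accepts it, so a wrong suggestion costs time but does not produce a wrong verdict within the checked range.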

Haoze Wu, Clark Barrett, Nina Narodytska • 2023

Related benchmarks

Task	Dataset	Result
Formal Verification	Code2Inv (#=133) (test)	Solved Tasks: 107	8
Formal Verification	SV-COMP (#=47)	Solved Tasks: 25	8
Loop invariant inference	Multi-loop benchmark (OOPSLA-13, SV-COMP, CLRS-Alg)	Solved Count (Only): 0.00e+0	8
Loop invariant inference	Single-loop Benchmark OOPSLA-13 SV-COMP CLRS-Alg	Solved Count (Only): 0.00e+0	8