Autoformalization with Large Language Models
About
Autoformalization is the process of automatically translating from natural language mathematics to formal specifications and proofs. A successful autoformalization system could advance the fields of formal verification, program synthesis, and artificial intelligence. While the long-term goal of autoformalization seemed elusive for a long time, we show large language models provide new prospects towards this goal. We make the surprising observation that LLMs can correctly translate a significant portion ($25.3\%$) of mathematical competition problems perfectly to formal specifications in Isabelle/HOL. We demonstrate the usefulness of this process by improving a previously introduced neural theorem prover via training on these autoformalized theorems. Our methodology results in a new state-of-the-art result on the MiniF2F theorem proving benchmark, improving the proof rate from $29.6\%$ to $35.2\%$.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Formal Theorem Proving | MiniF2F (test) | Pass@135.2 | 100 | |
| Theorem Proving | miniF2F (val) | Success Rate37.3 | 59 | |
| Formal Theorem Proving | miniF2F Isabelle (val) | Success Rate37.3 | 41 | |
| Formal Theorem Proving | miniF2F Isabelle (test) | Success Rate35.2 | 39 | |
| Formal Theorem Proving | miniF2F (val) | Pass@137.3 | 15 |