Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs

About

The formalization of existing mathematical proofs is a notoriously difficult process. Despite decades of research on automation and proof assistants, writing formal proofs remains arduous and only accessible to a few experts. While previous studies to automate formalization focused on powerful search algorithms, no attempts were made to take advantage of available informal proofs. In this work, we introduce Draft, Sketch, and Prove (DSP), a method that maps informal proofs to formal proof sketches, and uses the sketches to guide an automated prover by directing its search to easier sub-problems. We investigate two relevant setups where informal proofs are either written by humans or generated by a language model. Our experiments and ablation studies show that large language models are able to produce well-structured formal sketches that follow the same reasoning steps as the informal proofs. Guiding an automated prover with these sketches enhances its performance from 20.9% to 39.3% on a collection of mathematical competition problems.

Albert Q. Jiang, Sean Welleck, Jin Peng Zhou, Wenda Li, Jiacheng Liu, Mateja Jamnik, Timothée Lacroix, Yuhuai Wu, Guillaume Lample • 2022
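
The three stages named in the abstract (draft an informal proof, map it to a formal sketch, let an automated prover close the remaining sub-problems) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Sketch` class, the `draft_sketch_prove` function, and the toy `toy_draft`, `toy_sketch`, and `toy_prover` stand-ins are all hypothetical names introduced here, and the returned proof strings are placeholders rather than verified prover output.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Sketch:
    """A formal proof sketch: a statement plus open sub-problems (gaps)."""
    statement: str
    gaps: List[str]


def draft_sketch_prove(
    statement: str,
    draft: Callable[[str], str],                # statement -> informal proof
    sketch: Callable[[str, str], Sketch],       # (statement, informal proof) -> sketch
    prove_gap: Callable[[str], Optional[str]],  # gap -> proof text, or None on failure
) -> Optional[List[str]]:
    """Return one proof per gap, or None if any gap cannot be closed."""
    informal = draft(statement)           # stage 1: human- or model-written draft
    formal = sketch(statement, informal)  # stage 2: map the draft to a formal sketch
    proofs = []
    for gap in formal.gaps:               # stage 3: prover attacks each sub-problem
        proof = prove_gap(gap)
        if proof is None:
            return None                   # one open gap means the whole attempt fails
        proofs.append(proof)
    return proofs


# Toy stand-ins (assumptions for illustration only).
def toy_draft(statement: str) -> str:
    return "Subtract 3 from both sides. Conclude x = 4."


def toy_sketch(statement: str, informal: str) -> Sketch:
    # One gap per sentence of the informal proof.
    steps = [s.strip() for s in informal.split(".") if s.strip()]
    return Sketch(statement, steps)


def toy_prover(gap: str) -> Optional[str]:
    # Pretend the prover fails on anything marked "hard".
    return None if "hard" in gap else f"closed: {gap}"


result = draft_sketch_prove("x + 3 = 7 ==> x = 4", toy_draft, toy_sketch, toy_prover)
failed = draft_sketch_prove(
    "x + 3 = 7 ==> x = 4",
    lambda s: "A hard step.",
    toy_sketch,
    toy_prover,
)
```

In this toy run `result` holds one placeholder proof per sentence of the draft, while `failed` is `None` because a single gap stays open. The design point it mirrors from the abstract is that the prover only ever sees the smaller sub-problems, never the full statement.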

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Formal Theorem Proving | MiniF2F (test) | Pass@1 | 39.3 | 128 |
| Logical reasoning | FOLIO | Accuracy | 53.7 | 123 |
| Automated Theorem Proving | MiniF2F (test) | Success Rate | 49.2 | 93 |
| Theorem Proving | miniF2F (val) | Success Rate | 42.6 | 59 |
| Logical reasoning | ZebraLogic | Accuracy | 48 | 48 |
| Logical reasoning | HLE | Accuracy | 0.064 | 46 |
| Logical reasoning | AR-LSAT | Accuracy | 67.8 | 44 |
| Logical reasoning | ProofWriter | Accuracy | 42.8 | 44 |
| Formal Theorem Proving | miniF2F Isabelle (val) | Success Rate | 42.6 | 41 |
| Formal Theorem Proving | miniF2F Isabelle (test) | Success Rate | 39.3 | 39 |
Showing 10 of 11 rows
