Generative Language Modeling for Automated Theorem Proving
About
We explore the application of transformer-based language models to automated theorem proving. This work is motivated by the possibility that a major limitation of automated theorem provers compared to humans -- the generation of original mathematical terms -- might be addressable via generation from language models. We present an automated prover and proof assistant, GPT-f, for the Metamath formalization language, and analyze its performance. GPT-f found new short proofs that were accepted into the main Metamath library, which is, to our knowledge, the first time a deep-learning based system has contributed proofs that were adopted by a formal mathematics community.
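The core loop described above -- a language model proposes candidate proof steps for a goal, and a search procedure expands the most promising proof states first -- can be sketched as a best-first search. Everything below is a toy illustration, not the paper's implementation: `propose_steps` is a hypothetical stand-in for the trained transformer, and `apply_step` is a stub for the Metamath verification kernel.

```python
import heapq

# Hypothetical stand-in for the GPT-f language model: given a goal statement,
# return candidate (proofstep, log_probability) pairs. In the real system these
# would be samples from a transformer conditioned on the goal.
def propose_steps(goal):
    toy_model = {
        "|- ( 2 + 2 ) = 4": [("apply 2p2e4", -0.1), ("apply eqid", -2.0)],
    }
    return toy_model.get(goal, [("apply eqid", -3.0)])

# Stub for the formal kernel: returns the list of remaining subgoals if the
# step is valid (empty list = goal closed), or None if the step fails.
def apply_step(goal, step):
    if step == "apply 2p2e4" and goal == "|- ( 2 + 2 ) = 4":
        return []
    return None

def best_first_search(root_goal, max_expansions=100):
    # Min-heap keyed on negated cumulative log-probability, so the most
    # likely partial proof is expanded first.
    queue = [(0.0, [root_goal], [])]
    for _ in range(max_expansions):
        if not queue:
            return None
        neg_lp, goals, proof = heapq.heappop(queue)
        if not goals:
            return proof  # all goals closed: proof found
        goal, rest = goals[0], goals[1:]
        for step, lp in propose_steps(goal):
            subgoals = apply_step(goal, step)
            if subgoals is not None:
                heapq.heappush(
                    queue, (neg_lp - lp, subgoals + rest, proof + [step])
                )
    return None
```

Under these toy definitions, `best_first_search("|- ( 2 + 2 ) = 4")` returns the one-step proof `["apply 2p2e4"]`; the invalid candidate is rejected by the kernel stub and pruned.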
Stanislas Polu, Ilya Sutskever • 2020
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Formal Theorem Proving | MiniF2F (test) | Pass@1 | 36.6 | 100 |
| Theorem Proving | miniF2F Lean (test) | Pass@64 | 36.6 | 24 |
| Theorem Proving | miniF2F Lean (val) | -- | -- | 10 |
| Automated Theorem Proving | Metamath (val) | Performance | 56.5 | 6 |
| Formal Theorem Proving | Metamath set.mm (val) | Performance Score | 29.22 | 3 |
| Theorem Proving | miniF2F Lean (curriculum) | Pass@64 | 30.6 | 3 |