Generative Language Modeling for Automated Theorem Proving
About
We explore the application of transformer-based language models to automated theorem proving. This work is motivated by the possibility that a major limitation of automated theorem provers compared to humans -- the generation of original mathematical terms -- might be addressable via generation from language models. We present an automated prover and proof assistant, GPT-f, for the Metamath formalization language, and analyze its performance. GPT-f found new short proofs that were accepted into the main Metamath library, which is, to our knowledge, the first time a deep-learning based system has contributed proofs that were adopted by a formal mathematics community.
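The core loop described above -- a language model proposes candidate proof steps for a goal, and a search procedure expands the most promising proof states first -- can be sketched as a best-first search. Everything below is a toy illustration, not the paper's implementation: `propose_steps` is a hypothetical stand-in for the trained transformer, and `apply_step` is a stub for the Metamath verification kernel.

```python
import heapq

# Hypothetical stand-in for the GPT-f language model: given a goal statement,
# return candidate (proofstep, log_probability) pairs. In the real system these
# would be samples from a transformer conditioned on the goal.
def propose_steps(goal):
    toy_model = {
        "|- ( 2 + 2 ) = 4": [("apply 2p2e4", -0.1), ("apply eqid", -2.0)],
    }
    return toy_model.get(goal, [("apply eqid", -3.0)])

# Stub for the formal kernel: returns the list of remaining subgoals if the
# step is valid (empty list = goal closed), or None if the step fails.
def apply_step(goal, step):
    if step == "apply 2p2e4" and goal == "|- ( 2 + 2 ) = 4":
        return []
    return None

def best_first_search(root_goal, max_expansions=100):
    # Min-heap keyed on negated cumulative log-probability, so the most
    # likely partial proof is expanded first.
    queue = [(0.0, [root_goal], [])]
    for _ in range(max_expansions):
        if not queue:
            return None
        neg_lp, goals, proof = heapq.heappop(queue)
        if not goals:
            return proof  # all goals closed: proof found
        goal, rest = goals[0], goals[1:]
        for step, lp in propose_steps(goal):
            subgoals = apply_step(goal, step)
            if subgoals is not None:
                heapq.heappush(
                    queue, (neg_lp - lp, subgoals + rest, proof + [step])
                )
    return None
```

Under these toy definitions, `best_first_search("|- ( 2 + 2 ) = 4")` returns the one-step proof `["apply 2p2e4"]`; the invalid candidate is rejected by the kernel stub and pruned.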
Stanislas Polu, Ilya Sutskever • 2020
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Formal Theorem Proving | MiniF2F (test) | Pass@1 | 36.6 | 100 |
| Theorem Proving | miniF2F Lean (test) | Pass@64 | 36.6 | 24 |
| Theorem Proving | miniF2F Lean (val) | -- | -- | 10 |
| Automated Theorem Proving | Metamath (val) | Performance | 56.5 | 6 |
| Formal Theorem Proving | Metamath set.mm (val) | Performance Score | 29.22 | 3 |
| Theorem Proving | miniF2F Lean (curriculum) | Pass@64 | 30.6 | 3 |