
Generative Language Modeling for Automated Theorem Proving

About

We explore the application of transformer-based language models to automated theorem proving. This work is motivated by the possibility that a major limitation of automated theorem provers compared to humans -- the generation of original mathematical terms -- might be addressable via generation from language models. We present an automated prover and proof assistant, GPT-f, for the Metamath formalization language, and analyze its performance. GPT-f found new short proofs that were accepted into the main Metamath library. To our knowledge, this is the first time a deep-learning-based system has contributed proofs that were adopted by a formal mathematics community.
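The core idea described above -- sampling candidate proof steps from a language model and searching over the resulting proof states -- can be illustrated with a minimal best-first search sketch. This is a hypothetical toy, not the paper's implementation: `toy_model` is a hard-coded stand-in for a trained transformer that would score and propose Metamath proof steps, and the goal strings are informal placeholders.

```python
import heapq

def toy_model(goal):
    """Stand-in for a trained language model (hypothetical stub).

    Returns (tactic, new_subgoals, log_prob) proposals for a goal;
    a real system would sample these from a transformer.
    """
    proposals = {
        "(2+2)=4": [("rewrite 2+2 -> 4", ["4=4"], -0.2)],
        "4=4": [("reflexivity", [], -0.05)],
    }
    return proposals.get(goal, [])

def prove(goal, budget=1000):
    """Best-first proof search ordered by cumulative negative log-probability."""
    frontier = [(0.0, [goal], [])]  # (cost, open_goals, proof_steps)
    while frontier and budget:
        budget -= 1
        cost, goals, steps = heapq.heappop(frontier)
        if not goals:
            return steps  # all goals closed: proof found
        first, rest = goals[0], goals[1:]
        for tactic, subgoals, logp in toy_model(first):
            # Lower cost = higher model probability, so the most
            # promising proof states are expanded first.
            heapq.heappush(frontier, (cost - logp, subgoals + rest, steps + [tactic]))
    return None  # search budget exhausted

print(prove("(2+2)=4"))  # → ['rewrite 2+2 -> 4', 'reflexivity']
```

The search prioritizes states the model considers likely, which is what lets a language model contribute "original" terms: any string the model generates can become a proof step candidate, subject to verification by the proof checker (omitted here).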

Stanislas Polu, Ilya Sutskever • 2020

Related benchmarks

Task                      | Dataset                   | Metric            | Result | Rank
--------------------------|---------------------------|-------------------|--------|-----
Formal Theorem Proving    | MiniF2F (test)            | Pass@1            | 36.6   | 100
Theorem Proving           | miniF2F Lean (test)       | Pass@64           | 36.6   | 24
Theorem Proving           | miniF2F Lean (val)        | --                | --     | 10
Automated Theorem Proving | Metamath (val)            | Performance       | 56.5   | 6
Formal Theorem Proving    | Metamath set.mm (val)     | Performance Score | 29.22  | 3
Theorem Proving           | miniF2F Lean (curriculum) | Pass@64           | 30.6   | 3
