
Formal Mathematics Statement Curriculum Learning

About

We explore the use of expert iteration in the context of language modeling applied to formal mathematics. We show that, at the same compute budget, expert iteration, by which we mean proof search interleaved with learning, dramatically outperforms proof search alone. We also observe that when applied to a collection of formal statements of sufficiently varied difficulty, expert iteration is capable of finding and solving a curriculum of increasingly difficult problems, without the need for associated ground-truth proofs. Finally, by applying this expert iteration to a manually curated set of problem statements, we achieve state-of-the-art results on the miniF2F benchmark, automatically solving multiple challenging problems drawn from high school olympiads.

Stanislas Polu, Jesse Michael Han, Kunhao Zheng, Mantas Baksys, Igor Babuschkin, Ilya Sutskever • 2022
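
The contrast the abstract draws between proof search alone and proof search interleaved with learning can be illustrated with a toy sketch. This is not the authors' implementation: `ToyProver`, its scalar `skill`, the "one level above current skill" search rule, and the statement difficulties are all hypothetical stand-ins chosen only to make the loop structure visible.

```python
from dataclasses import dataclass


@dataclass
class ToyProver:
    """Hypothetical stand-in for a language-model prover with a scalar skill."""
    skill: int = 0

    def search(self, difficulty: int) -> bool:
        # Toy rule (assumption): search reaches at most one level above current skill.
        return difficulty <= self.skill + 1

    def learn(self, solved_difficulties: list[int]) -> None:
        # Toy "training" (assumption): skill rises to the hardest statement just proved.
        if solved_difficulties:
            self.skill = max(self.skill, max(solved_difficulties))


def run(statements: dict[str, int], rounds: int, learn: bool) -> set[str]:
    """Run `rounds` passes of search; retrain on newly found proofs if `learn` is set."""
    prover = ToyProver()
    solved: set[str] = set()
    for _ in range(rounds):
        # Attempt every statement that is still unsolved.
        new = [name for name, d in statements.items()
               if name not in solved and prover.search(d)]
        solved.update(new)
        if learn:
            # Expert iteration: the proofs found in this round become training data.
            prover.learn([statements[name] for name in new])
    return solved


# Statement names and difficulties are made up for illustration.
statements = {"s1": 1, "s2": 2, "s3": 3, "s4": 4, "s5": 6}
search_only = run(statements, rounds=4, learn=False)
expert_iteration = run(statements, rounds=4, learn=True)
print(sorted(search_only))       # ['s1']
print(sorted(expert_iteration))  # ['s1', 's2', 's3', 's4']
```

With the same number of search rounds, the learning variant climbs a curriculum one difficulty level at a time, while the search-only variant stays at the first level. In this toy, the gap in difficulty before `s5` stops progress, which loosely mirrors the abstract's condition that the statements be of sufficiently varied difficulty.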

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Formal Theorem Proving | miniF2F (test) | Pass@1 | 29.6 | 100 |
| Automated Theorem Proving | miniF2F (test) | Success Rate | 29.6 | 93 |
| Theorem Proving | miniF2F (val) | Success Rate | 33.6 | 59 |
| Theorem Proving | miniF2F Lean (test) | Pass@64 | 36.6 | 24 |
| Formal Theorem Proving | miniF2F (val) | Pass@1 | 33.6 | 15 |
| Theorem Proving | miniF2F Lean (val) | Cumulative Pass Rate | 58.6 | 10 |
| Formal Theorem Proving | mathlib (val) | Pass@1 | 62.6 | 9 |
| Formal Theorem Proving | mathlib (test) | Pass@1 | 63 | 3 |
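
For readers unfamiliar with the Pass@k entries above, here is a minimal sketch of one common reading of the metric: the percentage of statements for which at least one of the first k proof attempts succeeds. This reading, the helper name `pass_at_k`, and the attempt logs are assumptions for illustration; the listing does not say how each result was computed.

```python
def pass_at_k(attempts: dict[str, list[bool]], k: int) -> float:
    """Percentage of statements with at least one success among the first k attempts."""
    if not attempts:
        raise ValueError("no statements given")
    # A statement counts as solved if any of its first k attempts succeeded.
    solved = sum(any(outcomes[:k]) for outcomes in attempts.values())
    return 100.0 * solved / len(attempts)


# Hypothetical attempt logs: one list of success flags per statement.
logs = {
    "stmt_a": [True, False, False, False],
    "stmt_b": [False, False, True, False],
    "stmt_c": [False, False, False, False],
    "stmt_d": [False, True, False, False],
}
print(pass_at_k(logs, 1))  # 25.0
print(pass_at_k(logs, 4))  # 75.0
```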