Formal Theorem Proving on mathlib (val)
[Chart: Pass@1, Pass@8, and Pass@64 over time. Current best: 62.6 Pass@1, θ_mathlib (expert iterated on mathlib-train), Feb 3, 2022.]
Evaluation Results
Method                                        Settings                  Date     Pass@1  Pass@8  Pass@64
θ_mathlib (expert iterated on mathlib-train)  search width (d)=512,...  2022.02  62.6    70.7    75.8
θ_full (expert iterated on full curriculum)   search width (d)=512,...  2022.02  61.7    69.8    75.3
θ₁ (value-function based search)              search width (d)=512,...  2022.02  56.3    66.3    72
θ1                                            d (expansions)=512, e...  2022.02  56.3    66.3    -
θ1 (outcome objective)                        d (expansions)=512, e...  2022.02  55.6    65.9    -
θ0 (PACT setup)                               d (expansions)=512, e...  2022.02  48.5    57.6    -
PACT                                          search width (d)=512,...  2022.02  48.4    -       -
PACT                                          d (expansions)=512, e...  2022.02  48.4    -       -
θ0                                            d (expansions)=512, e...  2022.02  46.7    57.5    -