Theorem Proving on miniF2F Lean (val)
[Chart: Cumulative Pass Rate and Pass@64 over time, May 23, 2022 to May 23, 2024. Highest Cumulative Pass Rate: 60.2 (DeepSeekMath-Base).]
Updated 4d ago
Evaluation Results
Method | Settings | Date | Cumulative Pass Rate | Pass@64
DeepSeekMath-Base | Model size=7B, Generat... | 2024.05 | 60.2 | -
Evariste | Online training statem... | 2022.05 | 58.6 | -
Curriculum Learning | Model size=837M, Gener... | 2024.05 | 58.6 | -
Curriculum Learning | Model size=837M, Gener... | 2024.05 | 47.3 | -
Curriculum Learning | Model size=837M, Gener... | 2024.05 | 41.2 | -
Curriculum Learning | Model size=837M, Gener... | 2024.05 | 33.6 | -
Proof Artifact Co-Training | Model size=837M, Gener... | 2024.05 | 29.3 | -
GPT-4-turbo 0409 | Generation times=64 | 2024.05 | 25.4 | -
DeepSeekMath-Base | Model size=7B, Generat... | 2024.05 | 25.4 | -
Proof Artifact Co-Training | Model size=837M, Gener... | 2024.05 | 23.9 | -
Supervised | Train time (A100 days)=50 | 2022.05 | - | 38.5
GPT-f | Train time (A100 days)... | 2022.05 | - | 47.3
Evariste-1d | Online training statem... | 2022.05 | - | 46.7
Evariste-7d | Online training statem... | 2022.05 | - | 47.5