Informal-to-formal proving on miniF2F (val)
[Chart: Proven Theorems Rate over time, Oct 16, 2023 to Feb 5, 2024. Top result: DeepSeekMath-Base, 25.8.]
Evaluation Results
| Method | Configuration | Date | Proven Theorems Rate (%) |
| --- | --- | --- | --- |
| DeepSeekMath-Base | Size=7B, Prompting=Few... | 2024.02 | 25.8 |
| LLEMMA-34b | Decoding=greedy, Proof... | 2023.10 | 21.03 |
| Llemma | Size=34B, Prompting=Fe... | 2024.02 | 21 |
| LLEMMA-7b | Decoding=greedy, Proof... | 2023.10 | 20.6 |
| Llemma | Size=7B, Prompting=Few... | 2024.02 | 20.6 |
| Mistral | Size=7B, Prompting=Few... | 2024.02 | 18.9 |
| CodeLlama | Size=34B, Prompting=Fe... | 2024.02 | 18.5 |
| Code Llama 34b | Decoding=greedy, Proof... | 2023.10 | 18.45 |
| Code Llama 7b | Decoding=greedy, Proof... | 2023.10 | 16.31 |
| CodeLlama | Size=7B, Prompting=Few... | 2024.02 | 16.3 |
| Sledgehammer | Decoding=greedy, Proof... | 2023.10 | 14.72 |