Informal-to-Formal Proving on miniF2F (test)
[Chart: Accuracy over time. Highlighted point: DeepSeekMath-Base, Accuracy 24.6, Feb 5, 2024]
Updated 4d ago
Evaluation Results

Method             Configuration              Date     Accuracy
DeepSeekMath-Base  Size=7B, Prompting=Few...  2024.02  24.6
Llemma             Size=7B, Prompting=Few...  2024.02  22.1
Llemma             Size=34B, Prompting=Fe...  2024.02  21.3
Mistral            Size=7B, Prompting=Few...  2024.02  18
CodeLlama          Size=34B, Prompting=Fe...  2024.02  18
CodeLlama          Size=7B, Prompting=Few...  2024.02  17.6