
FormalML

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Theorem Proving | FormalML Hard Level-3 | Solved Rate: 95 | 6
Formal Theorem Proving | FormalML Hard | Proof Length: 11.2 | 6
Automated Theorem Proving | FormalML-Hard (Machine Learning Theory) 1.0 (test) | Output Tokens (k): 0.4 | 6