| Dataset Name | SOTA Method | Metric | Value | Results | Updated |
|---|---|---|---|---|---|
| MiniF2F (test) | Seed-Prover | Success Rate | 99.6 | 93 | 1mo ago |
| CoqGym (test) | ASTactic + hammer | Success Rate | 30 | 9 | 1mo ago |
| seL4 | Stepwise (Mistral) | Proof Success Rate | 77.6 | 6 | 26d ago |
| seL4 hard (test) | Stepwise (Mistral) | Proof Success Rate | 69.8 | 6 | 26d ago |
| seL4 (test) | Stepwise (Mistral) | Proof Success Rate | 89 | 6 | 26d ago |
| seL4 (val) | Stepwise (Mistral) | Proof Success Rate | 79.8 | 6 | 26d ago |
| Metamath (val) | 700m policy+value a = 32 | Performance | 56.5 | 6 | 1mo ago |
| seL4 proof corpus (full library) | Stepwise | Proof Lines Count | 6,235 | 5 | 26d ago |
| FATE-X | Seed-Prover 1.5 | Pass Rate | 33 | 5 | 1mo ago |
| FATE-M | AxProverBase | Pass Rate | 98 | 5 | 1mo ago |
| FVELER hard (test) | FVEL-Llama-3-8B | Solved Proofs | 64 | 4 | 1mo ago |
| FVELER (test) | FVEL-Llama-3-8B | Solved Proofs | 88 | 4 | 1mo ago |
| HOList complex analysis corpus (val) | Subexpression sharing 12-hop GNN | Proofs Closed Rate | 49.95 | 3 | 1mo ago |
| LeanCAT | AxProverBase | Pass Rate | 59 | 2 | 1mo ago |
| IMO 2025 | Mechanic | P1 Score | 141 | 1 | 23d ago |
| Principia Mathematica 23 Theorems Chapter 2 (All) | - | - | - | 0 | 1mo ago |