Hierarchical Attention Generates Better Proofs
About
Large language models (LLMs) have shown promise in formal theorem proving, but their token-level processing often fails to capture the inherent hierarchical nature of mathematical proofs. We introduce **Hierarchical Attention**, a regularization method that aligns LLMs' attention mechanisms with mathematical reasoning structures. Our approach establishes a five-level hierarchy from foundational elements to high-level concepts, ensuring structured information flow in proof generation. Experiments demonstrate that our method improves proof success rates by 2.05% on miniF2F and 1.69% on ProofNet while reducing proof complexity by 23.81% and 16.50% respectively. The code is available at https://github.com/Car-pe/HAGBP.
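The abstract describes constraining attention so that information flows through a five-level hierarchy, from foundational elements up to high-level concepts. The paper's exact formulation is not given here, so the following is only a minimal sketch of one plausible regularizer: each token is assigned a hierarchy level, and attention mass flowing from a lower-level query to a higher-level key is penalized. The function name, the level assignment, and the penalty form are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def hierarchy_violation_penalty(attn, levels):
    """Hypothetical regularizer: sum of attention mass that flows
    'upward' in the hierarchy, i.e. from a lower-level (foundational)
    query token to a higher-level (conceptual) key token.

    attn:   (T, T) attention weights; each row sums to 1 (query -> key).
    levels: length-T integer hierarchy levels, 0 = foundational, 4 = concept.
    """
    levels = np.asarray(levels)
    # disallowed[i, j] is True when query i sits at a strictly lower
    # level than key j, so its attention to j violates bottom-up flow
    disallowed = levels[:, None] < levels[None, :]
    return float((attn * disallowed).sum())

# Toy example: 4 tokens with assumed levels [0, 1, 1, 3],
# attention weights from a random softmax
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 4))
attn = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
penalty = hierarchy_violation_penalty(attn, [0, 1, 1, 3])
```

In training, a term like this would be added to the usual language-modeling loss with a weighting coefficient, nudging attention heads toward the hierarchical structure rather than hard-masking them.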
Jianlong Chen, Chao Li, Yang Yuan, Andrew C. Yao • 2025
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Formal Theorem Proving | MiniF2F (test) | Pass@1 | 31.56 | 100 |
| Automated Theorem Proving | MiniF2F (test) | Success Rate | 27.87 | 93 |
| Theorem Proving | miniF2F (val) | Success Rate | 34.02 | 59 |
| Theorem Proving | ProofNet (test) | Pass Rate (%) | 15.25 | 12 |
| Theorem Proving | ProofNet (val) | Accuracy | 11.86 | 11 |
| Formal Theorem Proving | ProofNet (val) | Pass Rate | 9.04 | 6 |