Formula-One Prompting: Equation-First Reasoning For Applied Mathematics
About
LLMs encode vast mathematical knowledge including governing equations from pretraining on equation-rich corpora, yet existing prompting methods, including Chain-of-Thought (CoT) and Program-of-Thought (PoT), do not explicitly elicit equation formulation as a reasoning stage. We propose Formula-One Prompting (F-1), a single-call, two-phase approach that fills this equation gap by using mathematical equations as an intermediate representation before solving through natural flow reasoning. F-1 first formulates governing equations from problem descriptions; the model then naturally selects a solving strategy among CoT, PoT, or direct computation based on the formalized equation structure, without explicit routing rules. Results across five models and four benchmarks show F-1 outperforms CoT by +5.76% and PoT by +8.42% on average, winning 53 out of 60 benchmark-model comparisons (88.3%). Gains are largest in applied domains: +13.30% on FinanceMath over CoT, and within OlympiadBench, larger gains on physics (+2.55%) than pure math (+0.44%). Per-problem analysis confirms equation formalization is the primary driver.
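The single-call, two-phase design described above can be sketched as a prompt template. This is a minimal illustration, not the paper's actual prompt: the instruction wording and the `build_f1_prompt` helper are assumptions made for the example.

```python
# Hypothetical sketch of a Formula-One (F-1) style single-call prompt.
# Phase 1 elicits governing equations; Phase 2 lets the model choose a
# solving strategy (CoT, PoT, or direct computation) from their structure.
# The exact wording is an assumption, not the paper's published prompt.

F1_TEMPLATE = """You are solving an applied mathematics problem.

Phase 1 (Formulate): Write down the governing equations for the
problem below, defining every symbol you introduce.

Phase 2 (Solve): Using the equations from Phase 1, choose whichever
strategy fits their structure -- step-by-step reasoning, a short
program, or direct computation -- and compute the final answer.

Problem:
{problem}

Answer:"""


def build_f1_prompt(problem: str) -> str:
    """Return one prompt that elicits equation formulation before
    solving, matching F-1's single-call, two-phase design."""
    return F1_TEMPLATE.format(problem=problem.strip())


prompt = build_f1_prompt(
    "A principal of $1,000 grows at 5% annual compound interest. "
    "What is its value after 3 years?"
)
print(prompt)
```

Both phases live in one model call, so no routing logic or second request is needed; the model's choice of solving strategy in Phase 2 is conditioned on the equations it wrote in Phase 1.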
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Mathematical Reasoning | OlympiadBench | Accuracy | 65.81 | 57 |
| Mathematical Reasoning | IMO-Bench | Accuracy | 57.02 | 20 |
| Mathematical Reasoning | FinanceMath | Accuracy | 64 | 20 |
| Mathematical Reasoning | Overall Macro-average | Accuracy (%) | 70.97 | 20 |
| Mathematical Reasoning | AICrypto | Accuracy | 0.985 | 20 |
| Cryptographic Proof | AICrypto (test) | Efficiency Ratio | 1.42 | 4 |
| Cryptography Reasoning | AICrypto (n=18) | Tokens per Correct | 8.43e+3 | 4 |
| Financial Calculation | FinanceMath (test) | Efficiency Ratio | 3.04 | 4 |
| Financial Mathematical Reasoning | FinanceMath (n=200) | Tokens per Correct | 4.37e+3 | 4 |
| Olympiad Mathematical Reasoning | OlympiadBench | Tokens per Correct | 2.55e+4 | 4 |