Formula-One Prompting: A Composable Equation-First Prefix for Applied Mathematics
About
This paper introduces Formula Prompting (FP) and Formula-One Prompting (F-1), two single-call methods that elicit governing equations before solving applied-math problems. Chain-of-Thought (CoT) and Program-of-Thought (PoT) prompting improve mathematical reasoning by eliciting reasoning traces or code-like structures learned during pretraining. This suggests a diagnostic question: which useful pretraining patterns remain under-elicited? Using infini-gram-mini, we scan 81.7 trillion pretraining tokens and find that, in curated corpora such as DataComp-LM, equation-centered language appears 121x more often than code and 3.79x more often than step-by-step narration, yet standard prompting methods do not explicitly elicit equation formulation. FP asks the model to formalize a problem's governing equations before solving; F-1 extends FP with a composable Phase 2 that selects Direct, CoT, or PoT-style solving in the same call. Across five reasoning models and four applied-math benchmarks (finance, physics, cryptography, competition math), F-1 outperforms CoT by 5.76 pp and PoT by 8.42 pp on average, with the largest gain of 13.30 pp on FinanceMath, while topping the accuracy-token efficiency frontier at only 68 prompt tokens of overhead. Variant ablations identify the equation-formalization prefix, not the strategy menu, as the primary driver: adding CoT or PoT on top of the prefix yields no further gain, and 73.3% of remaining failures occur downstream of a correct Phase-1 equation.
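To make the single-call, two-phase structure concrete, here is a minimal sketch of how an equation-first prefix might be assembled. The prompt wording, the constant names, and the functions `build_fp_prompt` and `build_f1_prompt` are all hypothetical illustrations. They are not the authors' prompt text, and this sketch makes no attempt to match the 68-token overhead reported above.

```python
# Hypothetical wording throughout: this illustrates the two-phase idea only
# and is NOT the prompt used in the paper.
PHASE1 = (
    "Phase 1: Before solving, write the governing equation(s) for this "
    "problem and define every symbol."
)
PHASE2 = (
    "Phase 2: Choose one strategy (Direct, CoT, or PoT), state your choice, "
    "and solve using the Phase 1 equations."
)
FP_SOLVE = "Then solve the problem using those equations."
FINAL = "End with a line of the form 'Final answer: <value>'."


def build_fp_prompt(problem: str) -> str:
    """Formula Prompting sketch: equation-first prefix, then solve."""
    return "\n".join([PHASE1, FP_SOLVE, FINAL, "", f"Problem: {problem}"])


def build_f1_prompt(problem: str) -> str:
    """F-1 sketch: the same prefix plus a Phase 2 strategy menu, in one call."""
    return "\n".join([PHASE1, PHASE2, FINAL, "", f"Problem: {problem}"])


if __name__ == "__main__":
    print(build_f1_prompt(
        "A bond pays a 5% annual coupon on a face value of 1000. "
        "What is the annual coupon payment?"
    ))
```

Both builders return one string meant to be sent in a single model call. The only difference is that the F-1 variant lets the model pick its solving style after the equations are written down.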
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Mathematical Reasoning | OlympiadBench | Accuracy | 65.81 | 57 |
| Mathematical Reasoning | IMO-Bench | Accuracy | 57.02 | 57 |
| Mathematical Reasoning | FinanceMath | Accuracy | 64 | 20 |
| Mathematical Reasoning | Overall Macro-average | Accuracy (%) | 70.97 | 20 |
| Mathematical Reasoning | AICrypto | Accuracy | 0.985 | 20 |
| Cryptographic Proof | AICrypto (test) | Efficiency Ratio | 1.42 | 4 |
| Cryptography Reasoning | AICrypto n=18 | Tokens per Correct | 8.43e+3 | 4 |
| Financial Calculation | FinanceMath (test) | Efficiency Ratio | 3.04 | 4 |
| Financial Mathematical Reasoning | FinanceMath n=200 | Tokens per Correct | 4.37e+3 | 4 |
| Olympiad Mathematical Reasoning | OlympiadBench | Tokens per Correct | 2.55e+4 | 4 |
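
The table reports a "Tokens per Correct" metric without defining it here. One plausible reading, stated as an assumption rather than the paper's definition, is the total number of tokens spent across all problems divided by the number of problems answered correctly. A minimal sketch under that assumption, with the hypothetical helper `tokens_per_correct`:

```python
from typing import Sequence


def tokens_per_correct(token_counts: Sequence[int],
                       correct_flags: Sequence[bool]) -> float:
    """Total tokens over all problems divided by the number of correct answers.

    Assumed definition for illustration; the paper may compute it differently.
    """
    if len(token_counts) != len(correct_flags):
        raise ValueError("token_counts and correct_flags must have equal length")
    n_correct = sum(1 for flag in correct_flags if flag)
    if n_correct == 0:
        # No correct answers: the cost per correct answer is unbounded.
        return float("inf")
    return sum(token_counts) / n_correct


if __name__ == "__main__":
    # Three problems, two solved correctly: (100 + 200 + 300) / 2
    print(tokens_per_correct([100, 200, 300], [True, False, True]))
```

Under this reading, lower is better: a method that spends fewer tokens or solves more problems reduces the value.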