Chain of Thought in Order: Discovering Learning-Friendly Orders for Arithmetic
About
Chain of thought, i.e., step-by-step reasoning, is one of the fundamental mechanisms of Transformers. The design of intermediate reasoning steps has been studied extensively and shown to critically influence performance on mathematical, multi-step reasoning tasks, but the ordering of these steps has received little attention despite its significant effect on the difficulty of reasoning. This study addresses a novel task of unraveling the chain of thought: reordering decoder input tokens into a sequence that is learning-friendly for Transformers on arithmetic tasks. The proposed pipeline first trains a Transformer on a mixture of target sequences arranged in different orders and then identifies benign orders as those whose loss drops quickly in the early stage of training. Because the search space grows factorially with sequence length, we propose a two-stage hierarchical approach that performs inter-block and then intra-block reordering. Experiments on seven order-sensitive arithmetic tasks show that our method identifies a learning-friendly order out of a few billion candidates. Notably, it recovered the reverse-digit order reported in prior studies for the multiplication task.
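The selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`apply_order`, `early_loss_drop`, `rank_orders`), the window parameter `k`, and the loss curves are all hypothetical placeholders, and no Transformer is trained here. The sketch only shows how a permutation reorders target tokens, how candidate orders might be ranked by their early loss drop, and how quickly the number of candidate orders grows.

```python
from math import factorial

def apply_order(tokens, perm):
    """Reorder target tokens: output position i takes tokens[perm[i]]."""
    if sorted(perm) != list(range(len(tokens))):
        raise ValueError("perm must be a permutation of range(len(tokens))")
    return [tokens[i] for i in perm]

def early_loss_drop(losses, k):
    """Loss decrease over the first k recorded steps (larger means a faster drop)."""
    return losses[0] - losses[min(k, len(losses) - 1)]

def rank_orders(loss_curves, k):
    """Sort candidate orders by early loss drop, fastest first."""
    return sorted(
        loss_curves,
        key=lambda perm: early_loss_drop(loss_curves[perm], k),
        reverse=True,
    )

# Reverse-digit order for a product: least significant digit first.
a, b = 12, 34
digits = list(str(a * b))                        # 408 -> ['4', '0', '8']
reverse = tuple(range(len(digits) - 1, -1, -1))  # (2, 1, 0)
reversed_digits = apply_order(digits, reverse)   # ['8', '0', '4']

# Hypothetical per-order loss curves (toy placeholders, NOT results from the paper).
toy_curves = {
    (0, 1, 2): [2.3, 2.2, 2.1, 2.0],
    (2, 1, 0): [2.3, 1.6, 1.0, 0.6],
    (1, 0, 2): [2.3, 2.1, 2.0, 1.9],
}
best = rank_orders(toy_curves, k=2)[0]

# The number of candidate orders for n target tokens is n!.
num_orders = factorial(len(digits))
```

In this toy setting the reversed order is ranked first only because its placeholder curve was written to fall fastest. In an actual run the curves would come from training on the mixture of orders, and a hierarchical search over blocks would be needed because enumerating all n! orders quickly becomes infeasible.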
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| CUBIC | Synthetic Arithmetic Tasks | Success Rate | 1.00e+4 | 14 |
| MLP | Synthetic Arithmetic Tasks | Success Rate | 100 | 14 |
| RELU | Synthetic Arithmetic Tasks | Success Rate | 0.996 | 14 |
| SINE | Synthetic Arithmetic Tasks | Success Rate | 100 | 14 |
| Square | Synthetic Arithmetic Tasks | Success Rate | 100 | 14 |
| TRIANGLE | Synthetic Arithmetic Tasks | Success Rate | 100 | 14 |
| Multiplication | PROD L=20 | Success Rate (Discovered) | 98.2 | 2 |
| Multiplication | PROD L=10 | Success Rate (Discovered) | 100 | 1 |
| Multiplication | PROD L=12 | Success Rate (Discovered) | 51.4 | 1 |