Chain of Simulation: A Dual-Mode Reasoning Framework for Large Language Models with Dynamic Problem Routing

About

We present Chain of Simulation (CoS), a novel dual-mode reasoning framework that dynamically routes problems to specialized reasoning strategies in Large Language Models (LLMs). Unlike existing uniform prompting approaches, CoS employs three distinct reasoning modes: (1) computational flow with self-consistency for mathematical problems, (2) symbolic state tracking with JSON representations for spatial reasoning, and (3) hybrid fact-extraction for multi-hop inference. Through comprehensive evaluation on GSM8K, StrategyQA, and bAbI benchmarks using four state-of-the-art models (Gemma-3 27B, LLaMA-3.1 8B, Mistral 7B, and Qwen-2.5 14B), we demonstrate that CoS achieves 71.5% accuracy on GSM8K (1.0% absolute improvement), 90.0% on StrategyQA (2.5% improvement), and 19.0% on bAbI (65.2% relative improvement) compared to the strongest baselines. The analysis reveals that problem-specific mode selection is crucial, with computational mode achieving 81.2% accuracy when correctly applied to mathematical problems, while misrouting leads to 0% accuracy. We provide detailed algorithms for mode selection, state tracking, and answer extraction, establishing CoS as an effective approach for improving LLM reasoning without additional training. The framework provides superior trade-offs between accuracy and efficiency compared to Self-Consistency, achieving comparable performance at 54% lower computational cost.
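The routing idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not say how mode selection works, so the keyword cues, the function names `select_mode` and `majority_vote`, and the mode labels are all assumptions made for illustration. The sketch only shows the overall shape: pick one of three modes per problem, and aggregate sampled answers by majority vote in the computational mode, which is the usual form of self-consistency.

```python
import re
from collections import Counter

# Hypothetical cue lists: the abstract does not specify the routing features.
SPATIAL_CUES = ("where is", "moved to", "went to", "travelled to", "journeyed to")
MATH_CUES = ("how many", "how much", "in total")


def select_mode(problem: str) -> str:
    """Return an assumed mode label for a problem statement."""
    text = problem.lower()
    # Location-tracking phrasing suggests symbolic state tracking.
    if any(cue in text for cue in SPATIAL_CUES):
        return "symbolic"
    # A digit plus a quantity question suggests the computational flow.
    if re.search(r"\d", text) and any(cue in text for cue in MATH_CUES):
        return "computational"
    # Everything else falls back to hybrid fact extraction.
    return "hybrid"


def majority_vote(answers: list[str]) -> str:
    """Self-consistency style aggregation: return the most frequent answer."""
    return Counter(answers).most_common(1)[0][0]


if __name__ == "__main__":
    print(select_mode("Mary went to the kitchen. Where is Mary?"))
    print(select_mode("Tom has 3 apples and buys 4 more. How many apples does he have?"))
    print(majority_vote(["7", "7", "8"]))
```

A keyword router like this is brittle. The abstract's observation that misrouting leads to 0% accuracy suggests that a real system needs a far more careful selection step than this placeholder.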

Saeid Sheikhi • 2026

Related benchmarks

Task                                 Dataset                        Result               Rank
Mathematical Reasoning               GSM8K (test)                   Accuracy 84          797
Reasoning                            StrategyQA (test)              Factuality Acc 96    28
Spatial Reasoning                    bAbI (test)                    Accuracy 22          20
Grade school mathematics reasoning   GSM8K sampled 50 (test)        Accuracy 71.5        5
Multi-hop commonsense reasoning      StrategyQA sampled 50 (val)    Accuracy 90          5
Spatial Reasoning                    bAbI Task 1 qa1 50 samples     Accuracy 19          5