TableMind++: An Uncertainty-Aware Programmatic Agent for Tool-Augmented Table Reasoning
About
Table reasoning requires models to jointly perform semantic understanding and precise numerical operations. Most existing methods rely on a single-turn reasoning paradigm over tables, which suffers from context overflow and weak numerical sensitivity. To address these limitations, we previously proposed TableMind, a tuning-based autonomous programmatic agent that simulates human-like interaction within a lightweight large language model (LLM). TableMind internalizes planning, action, and reflection through a two-stage training strategy: supervised fine-tuning (SFT) on filtered high-quality data, followed by reinforcement learning (RL) with a multi-perspective reward and the Rank-Aware Policy Optimization (RAPO) algorithm. While TableMind establishes a solid foundation for programmatic agents, the inherent stochasticity of LLMs remains a critical challenge that leads to hallucinations. In this paper, we extend this foundation to TableMind++ by introducing a novel uncertainty-aware inference framework that mitigates hallucinations. Specifically, to address epistemic uncertainty, we propose memory-guided plan pruning, which retrieves historical trajectories to validate candidate plans and filter out logically flawed ones. To ensure execution precision and mitigate aleatoric uncertainty, we introduce confidence-based action refinement, which monitors token-level probabilities to detect and self-correct syntactic noise. Finally, we employ dual-weighted trajectory aggregation to synthesize a robust consensus from multiple reasoning paths. Extensive experiments on diverse benchmarks demonstrate that TableMind++ consistently outperforms previous baselines and proprietary models, validating the effectiveness of integrating autonomous training with uncertainty quantification. Our code is available.
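As a rough illustration of the last two inference components described above, the following minimal sketch flags a low-confidence action from its token log-probabilities and combines several reasoning paths by weighted voting. It is not the TableMind++ implementation: the geometric-mean confidence measure, the `threshold` value, and the choice of a plan-level weight multiplied by an action-level confidence as the two weights are all assumptions made for illustration, as are the function names.

```python
import math
from collections import defaultdict


def mean_token_confidence(token_logprobs):
    """Geometric-mean token probability of one generated action.

    Assumption: this is only one plausible way to summarize
    token-level probabilities into a single confidence score.
    """
    if not token_logprobs:
        return 0.0
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def needs_refinement(token_logprobs, threshold=0.8):
    """Flag an action for self-correction when its confidence is low.

    The threshold of 0.8 is a hypothetical value, not taken from the paper.
    """
    return mean_token_confidence(token_logprobs) < threshold


def aggregate(trajectories):
    """Pick the answer with the largest total weight across trajectories.

    Each trajectory is (answer, plan_weight, action_confidence). Using the
    product of the two as the vote weight is an assumed reading of
    "dual-weighted"; the paper may define the weights differently.
    """
    if not trajectories:
        raise ValueError("at least one trajectory is required")
    scores = defaultdict(float)
    for answer, plan_weight, action_confidence in trajectories:
        scores[answer] += plan_weight * action_confidence
    return max(scores, key=scores.get)


# Hypothetical example: four reasoning paths voting on two candidate answers.
paths = [
    ("42", 0.9, 0.95),
    ("41", 0.6, 0.50),
    ("42", 0.4, 0.70),
    ("41", 0.8, 0.90),
]
best_answer = aggregate(paths)
```

A weighted vote of this kind lets a confident, well-supported path outweigh several uncertain ones, which plain majority voting cannot do.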
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Table Fact Verification | TabFact (test) | Accuracy | 93.73 | 136 |
| Table Question Answering | WikiTQ (test) | Accuracy | 78.07 | 130 |
| Financial Question Answering | FinQA (test) | Accuracy | 45.48 | 57 |
| Table Mathematical Reasoning | TabMWP (test) | Accuracy | 99.57 | 15 |
| Hierarchical Table Question Answering | HiTab (test) | Accuracy (%) | 73.69 | 15 |