
TS-Agent: Understanding and Reasoning Over Raw Time Series via Iterative Insight Gathering

About

Large language models (LLMs) exhibit strong symbolic and compositional reasoning, yet they struggle with time series question answering as the data is typically transformed into an LLM-compatible modality, e.g., serialized text, plotted images, or compressed time series embeddings. Such conversions impose representation bottlenecks, often require cross-modal alignment or finetuning, and can exacerbate hallucination and knowledge leakage. To address these limitations, we propose TS-Agent, an agentic, tool-grounded framework that uses LLMs strictly for iterative evidence-based reasoning, while delegating statistical and structural extraction to time series analytical tools operating on raw sequences. Our framework solves time series tasks through an evidence-driven agentic process: (1) it alternates between thinking, tool execution, and observation in a ReAct-style loop, (2) records intermediate results in an explicit evidence log and corrects the reasoning trace via a self-refinement critic, and (3) enforces a final answer-verification step to prevent hallucinations and leakage. Across four benchmarks spanning time series understanding and reasoning, TS-Agent matches or exceeds strong text-based, vision-based, and time-series language model baselines, with the largest gains on reasoning tasks where multimodal LLMs are prone to hallucination and knowledge leakage in zero-shot settings.
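The think / tool-execution / observation loop, evidence log, and final verification step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the tool names (`slope`, `value_range`), the `scripted_policy` function standing in for the LLM's thinking step, and the `verify` rule are all hypothetical, chosen only so the example runs without a model. The self-refinement critic is omitted.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

# Hypothetical analytical tools: plain functions computed on the raw sequence.
def slope(series: List[float]) -> float:
    """Least-squares slope of the series against its index."""
    n = len(series)
    x_bar = (n - 1) / 2
    y_bar = mean(series)
    num = sum((i - x_bar) * (y - y_bar) for i, y in enumerate(series))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den

def value_range(series: List[float]) -> float:
    """Difference between the largest and smallest value."""
    return max(series) - min(series)

TOOLS: Dict[str, Callable[[List[float]], float]] = {
    "slope": slope,
    "range": value_range,
}

def scripted_policy(question: str, evidence: Dict[str, float]) -> str:
    """Stand-in for the LLM 'think' step (an assumption of this sketch):
    request each tool once, then finish."""
    for name in TOOLS:
        if name not in evidence:
            return name
    return "finish"

def answer_from_evidence(evidence: Dict[str, float]) -> str:
    """Derive the answer only from logged tool outputs."""
    if "slope" not in evidence:
        return "insufficient evidence"
    s = evidence["slope"]
    if s > 0:
        return "upward"
    if s < 0:
        return "downward"
    return "flat"

def verify(answer: str, evidence: Dict[str, float]) -> bool:
    """Final check: accept the answer only if the evidence log supports it."""
    return "slope" in evidence and answer == answer_from_evidence(evidence)

def run_agent(question: str, series: List[float],
              max_steps: int = 5) -> Tuple[str, Dict[str, float]]:
    evidence: Dict[str, float] = {}  # explicit evidence log
    for _ in range(max_steps):
        action = scripted_policy(question, evidence)  # think
        if action == "finish":
            break
        evidence[action] = TOOLS[action](series)      # act, then observe
    answer = answer_from_evidence(evidence)
    if not verify(answer, evidence):
        answer = "insufficient evidence"
    return answer, evidence
```

Calling `run_agent("What is the trend?", [1.0, 2.0, 3.0, 4.0])` walks through both tools and returns an answer together with the evidence log it was derived from. In a real system the scripted policy would be replaced by an LLM call, which this sketch deliberately avoids.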

Penghang Liu, Elizabeth Fons, Annita Vapsi, Mohsen Ghassemi, Svitlana Vyetrenko, Daniel Borrajo, Vamsi K. Potluru, Manuela Veloso • 2025

Related benchmarks

Task                                Dataset                                        Result                     Rank
Time Series Reasoning               TSandLang Two TS                               Accuracy 57.3              12
Time Series Reasoning               TSandLang One TS                               Accuracy 80.5              12
Time Series Understanding           TSExam                                         Pattern Recognition 71     10
Time Series Reasoning               MMTS-Bench                                     Average Score 60           9
Time Series Feature Understanding   time series feature understanding benchmark    Trend Score 98             8