TimeXL: Explainable Multi-modal Time Series Prediction with LLM-in-the-Loop

About

Time series analysis provides essential insights into real-world system dynamics and informs downstream decision-making, yet most existing methods overlook the rich contextual signals present in auxiliary modalities. To bridge this gap, we introduce TimeXL, a multi-modal prediction framework that integrates a prototype-based time series encoder with three collaborating Large Language Models (LLMs) to deliver more accurate predictions and interpretable explanations. First, a multi-modal prototype-based encoder processes both time series and textual inputs to generate preliminary forecasts alongside case-based rationales. These outputs then feed into a prediction LLM, which refines the forecasts by reasoning over the encoder's predictions and explanations. Next, a reflection LLM compares the predicted values against the ground truth, identifying textual inconsistencies or noise. Guided by this feedback, a refinement LLM iteratively enhances text quality and triggers encoder retraining. This closed-loop workflow of prediction, critique (reflection), and refinement continuously boosts the framework's performance and interpretability. Empirical evaluations on four real-world datasets demonstrate that TimeXL achieves up to 8.9% improvement in AUC and produces human-centric, multi-modal explanations, highlighting the power of LLM-driven reasoning for time series prediction.

Yushan Jiang, Wenchao Yu, Geon Lee, Dongjin Song, Kijung Shin, Wei Cheng, Yanchi Liu, Haifeng Chen • 2025
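
The prediction, reflection, and refinement loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the nearest-prototype "encoder" here uses only the time series (the paper's encoder is multi-modal), the three "LLMs" are replaced by hypothetical rule-based stubs (predict_llm, reflect_llm, refine_llm), and all function names and data are invented for illustration. It only shows how the three roles and encoder retraining might fit together in a closed loop.

```python
from typing import Dict, List, Tuple

Sample = Dict[str, object]  # keys: "series" (list of floats), "text" (str), "label" (int)


def fit_prototypes(samples: List[Sample]) -> Dict[int, List[float]]:
    """Toy 'encoder training': one prototype per class (element-wise mean series)."""
    by_label: Dict[int, List[List[float]]] = {}
    for s in samples:
        by_label.setdefault(s["label"], []).append(s["series"])
    return {lab: [sum(col) / len(col) for col in zip(*rows)]
            for lab, rows in by_label.items()}


def encode(sample: Sample, prototypes: Dict[int, List[float]]) -> Tuple[int, str]:
    """Nearest-prototype prediction plus a case-based rationale string."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    label = min(prototypes, key=lambda lab: dist(sample["series"], prototypes[lab]))
    return label, f"closest to the prototype of class {label}"


def predict_llm(text: str, enc_label: int, rationale: str) -> int:
    """Hypothetical stub for the prediction LLM: trust keywords, else the encoder."""
    words = text.lower().split()
    if "rain" in words:
        return 1
    if "dry" in words:
        return 0
    return enc_label


def reflect_llm(samples: List[Sample], preds: List[int]) -> List[int]:
    """Hypothetical stub for the reflection LLM: flag samples predicted wrongly."""
    return [i for i, (s, p) in enumerate(zip(samples, preds)) if p != s["label"]]


def refine_llm(text: str) -> str:
    """Hypothetical stub for the refinement LLM: crudely drop the flagged text."""
    return ""


def run_loop(train: List[Sample], rounds: int = 3) -> Tuple[List[Sample], List[float]]:
    history = []
    for _ in range(rounds):
        prototypes = fit_prototypes(train)  # (re)train the toy encoder
        preds = []
        for s in train:
            enc_label, rationale = encode(s, prototypes)
            preds.append(predict_llm(s["text"], enc_label, rationale))
        history.append(sum(p == s["label"] for p, s in zip(preds, train)) / len(train))
        flagged = reflect_llm(train, preds)
        if not flagged:  # nothing left to critique
            break
        train = [dict(s, text=refine_llm(s["text"])) if i in flagged else s
                 for i, s in enumerate(train)]
    return train, history


data = [
    {"series": [0.1, 0.2], "text": "dry air", "label": 0},
    {"series": [0.0, 0.1], "text": "rumor: rain soon", "label": 0},  # misleading text
    {"series": [0.9, 1.0], "text": "heavy rain clouds", "label": 1},
    {"series": [1.0, 0.8], "text": "humid", "label": 1},
]
refined, history = run_loop(data)
print(history)  # [0.75, 1.0]
```

In this toy run the misleading text on the second sample causes one error in the first round; the reflection stub flags it, the refinement stub discards the text, and the second round falls back to the encoder's nearest-prototype prediction. A real system would evaluate on held-out data rather than the training samples used here.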

Related benchmarks

Task	Dataset	Metric	Result	
Time Series Regression	Finance dataset (test)	RMSE	4.161	19
Time-series classification	Weather (test)	F1 Score	69.6	19
Time-series classification	Finance (test)	F1 Score	63.1	19
Time-series classification	Healthcare TP (test)	F1 Score	98.7	19
Time-series classification	Healthcare MT (test)	F1 Score	95.6	19
