
Can Multimodal LLMs Perform Time Series Anomaly Detection?

About

Time series anomaly detection (TSAD) is a long-standing core problem in Web-scale systems and online infrastructure, with applications such as service reliability monitoring, system fault diagnosis, and performance optimization. While large language models (LLMs) have demonstrated unprecedented capabilities in time series analysis, the potential of multimodal LLMs (MLLMs), particularly vision-language models, in TSAD remains largely under-explored. One natural way for humans to detect time series anomalies is through visualization and textual description. This motivates our research question: Can multimodal LLMs perform time series anomaly detection? Existing studies often oversimplify the problem by treating point-wise anomalies as special cases of range-wise ones or by aggregating point anomalies to approximate range-wise scenarios. These simplifications limit our understanding of realistic scenarios such as multi-granular anomalies and irregular time series. To address this gap, we build the VisualTimeAnomaly benchmark to comprehensively investigate the zero-shot capabilities of MLLMs for TSAD, progressing from point-wise to range-wise to variate-wise anomalies and extending to irregular sampling conditions. Our study reveals several key insights into MLLMs for TSAD. Building on these findings, we propose TSAD-Agents, an MLLM-based multi-agent framework for automatic TSAD. The framework comprises scanning, planning, detection, and checking agents that collaborate to reason, plan, and self-reflect. These agents adaptively invoke tools such as traditional methods and MLLMs, and dynamically switch between text and image modalities to optimize detection performance.
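The four agent roles named in the abstract (scanning, planning, detection, checking) can be pictured with a small sketch. This is not the authors' TSAD-Agents implementation: every function name, the `Plan` fields, the length cutoff, and the use of a z-score as the "traditional method" are assumptions made for illustration, and the MLLM branch is deliberately left unimplemented.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List


@dataclass
class Plan:
    modality: str  # "text" or "image" (hypothetical labels)
    tool: str      # "zscore" or "mllm" (hypothetical labels)


def scan(series: List[float]) -> dict:
    """Scanning step: summarise basic properties of the series."""
    return {"length": len(series), "mean": mean(series), "std": pstdev(series)}


def make_plan(summary: dict, max_text_length: int = 1000) -> Plan:
    """Planning step: choose a tool and modality.

    Assumption: short series are handled as text by a statistical tool;
    longer ones would be plotted and sent to an MLLM (not implemented here).
    """
    if summary["length"] <= max_text_length:
        return Plan(modality="text", tool="zscore")
    return Plan(modality="image", tool="mllm")


def detect(series: List[float], summary: dict, plan: Plan,
           threshold: float = 3.0) -> List[int]:
    """Detection step: return candidate point-wise anomaly indices."""
    if plan.tool != "zscore":
        raise NotImplementedError("MLLM call is out of scope for this sketch")
    if summary["std"] == 0:
        return []  # a constant series has no z-score outliers
    return [i for i, x in enumerate(series)
            if abs(x - summary["mean"]) / summary["std"] > threshold]


def check(series: List[float], candidates: List[int],
          threshold: float = 3.0) -> List[int]:
    """Checking step: re-test each candidate against the rest of the series."""
    confirmed = []
    for i in candidates:
        rest = series[:i] + series[i + 1:]
        if not rest:
            continue
        mu, sigma = mean(rest), pstdev(rest)
        if sigma == 0:
            deviates = series[i] != mu
        else:
            deviates = abs(series[i] - mu) / sigma > threshold
        if deviates:
            confirmed.append(i)
    return confirmed


def run_pipeline(series: List[float]) -> List[int]:
    """Chain the four steps: scan, plan, detect, check."""
    summary = scan(series)
    plan = make_plan(summary)
    candidates = detect(series, summary, plan)
    return check(series, candidates)
```

For a flat series with a single spike, such as `[1.0] * 20 + [50.0] + [1.0] * 20`, `run_pipeline` flags only the spike's index. The sketch covers point-wise anomalies only; range-wise, variate-wise, and irregularly sampled cases would need different detection and checking logic.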

Xiongxiao Xu, Haoran Wang, Yueqing Liang, Philip S. Yu, Yue Zhao, Kai Shu • 2025

Related benchmarks

Task | Dataset | Metric | Result | Rank
Time Series Anomaly Detection | SMAP | Affiliation F1 | 91.65 | 29
Time Series Anomaly Detection | NAB | Affiliation-F1 | 62.46 | 11
Time Series Anomaly Detection | VisualTimeAnomaly Point 1.0 (test) | Precision | 55.4 | 10
Time Series Anomaly Detection | VisualTimeAnomaly Range 1.0 (test) | Precision (%) | 30.5 | 10
Time Series Anomaly Detection | VisualTimeAnomaly Irr-Point 1.0 (test) | Precision | 17.6 | 10
Time Series Anomaly Detection | VisualTimeAnomaly Irr-Range 1.0 (test) | Precision | 32 | 10
Time Series Anomaly Detection | Yahoo | Affiliation-F1 | 61.03 | 8