Missing-Aware Multimodal Fusion for Unified Microservice Incident Management

About

Automated incident management is critical for microservice reliability. While recent unified frameworks leverage multimodal data for joint optimization, they unrealistically assume perfect data completeness. In practice, network fluctuations and agent failures frequently cause missing modalities. Existing approaches relying on static placeholders introduce imputation noise that masks anomalies and degrades performance. To address this, we propose ARMOR, a robust self-supervised framework designed for missing modality scenarios. ARMOR features: (i) a modality-specific asymmetric encoder that isolates distribution disparities among metrics, logs, and traces; and (ii) a missing-aware gated fusion mechanism utilizing learnable placeholders and dynamic bias compensation to prevent cross-modal interference from incomplete inputs. By employing self-supervised auto-regression with mask-guided reconstruction, ARMOR jointly optimizes anomaly detection (AD), failure triage (FT), and root cause localization (RCL). AD and RCL require no fault labels, while FT relies solely on failure-type annotations for the downstream classifier. Extensive experiments demonstrate that ARMOR achieves state-of-the-art performance under complete data conditions and maintains robust diagnostic accuracy even with severe modality loss.
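As a rough illustration of the two ideas named in the abstract, missing-aware gated fusion and mask-guided reconstruction, the following is a minimal NumPy sketch. It is not ARMOR's implementation: the abstract gives no equations, so the embedding size `D`, the parameter tables (`placeholders`, `gate_w`, `missing_bias`), and the functions `gated_fusion` and `masked_mse` are all hypothetical names chosen here. In particular, "dynamic bias compensation" is interpreted, as an assumption, as an extra bias added to the gate logit when a modality is absent.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # hypothetical embedding size
MODALITIES = ("metric", "log", "trace")

# Stand-ins for parameters that would be learned during training (assumed, not from the paper).
placeholders = {m: rng.normal(scale=0.1, size=D) for m in MODALITIES}  # used when a modality is missing
gate_w = {m: rng.normal(scale=0.1, size=D) for m in MODALITIES}        # per-modality gate weights
missing_bias = {m: -2.0 for m in MODALITIES}                           # lowers the gate for absent inputs


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gated_fusion(embeddings, present):
    """Fuse per-modality embeddings into one vector.

    embeddings: dict modality -> (D,) array; entries for missing modalities are ignored.
    present:    dict modality -> bool; False means the modality is missing.
    Returns the gate-weighted average and the per-modality gate values.
    """
    fused = np.zeros(D)
    total_gate = 0.0
    gates = {}
    for m in MODALITIES:
        if present[m]:
            h, bias = embeddings[m], 0.0
        else:
            # Substitute the placeholder and shift the gate logit downward.
            h, bias = placeholders[m], missing_bias[m]
        g = sigmoid(gate_w[m] @ h + bias)
        gates[m] = g
        fused += g * h
        total_gate += g
    return fused / total_gate, gates  # total_gate > 0 because sigmoid is strictly positive


def masked_mse(pred, target, mask):
    """Mean squared error over observed entries only (mask == 1)."""
    mask = np.asarray(mask, dtype=float)
    sq_err = (np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)) ** 2
    return float((sq_err * mask).sum() / max(mask.sum(), 1.0))


# Usage: the log modality is missing for this sample.
emb = {m: rng.normal(size=D) for m in MODALITIES}
present = {"metric": True, "log": False, "trace": True}
fused, gates = gated_fusion(emb, present)
print({m: round(float(g), 3) for m, g in gates.items()})
```

In this sketch the missing modality still contributes a vector, so the fused shape stays fixed, but its gate is pushed toward zero so the placeholder has limited influence on the result. The masked loss likewise excludes unobserved entries, so reconstruction error is never computed against imputed values.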

Wenzhuo Qian, Hailiang Zhao, Ziqi Wang, Zhipeng Gao, Jiayi Chen, Zhiwei Ling, Shuiguang Deng • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Root Cause Localization | D1 complete data conditions | Top-1 Score | 82.1 | 7 |
| Root Cause Localization | D2 complete data conditions | Top-1 Accuracy | 81.5 | 7 |
| Anomaly Detection | D1 complete data conditions | Precision | 92.5 | 6 |
| Anomaly Detection | D2 complete data conditions | Precision | 99.3 | 6 |
| Failure Triage | D1 complete data conditions | Precision | 94.6 | 6 |
| Failure Triage | D2 complete data conditions | Precision | 88.2 | 6 |
| Anomaly Detection | D1 (test) | Execution Time (s) | 5.23 | 2 |
| Anomaly Detection | D2 (test) | Execution Time (s) | 6.71 | 2 |
| Failure Triage | D1 (test) | Execution Time (s) | 1.56 | 2 |
| Failure Triage | D2 (test) | Execution Time (s) | 1.45 | 2 |
Showing 10 of 12 rows
