Exploring Multi-Modal Data with Tool-Augmented LLM Agents for Precise Causal Discovery
About
Causal discovery is an essential foundation for decision-making across domains such as smart health, AI for drug discovery, and AIOps. Traditional statistical causal discovery methods, while well-established, predominantly rely on observational data and often overlook the semantic cues inherent in cause-and-effect relationships. The advent of Large Language Models (LLMs) has ushered in an affordable way of leveraging these semantic cues for knowledge-driven causal discovery, but the development of LLMs for causal discovery lags behind other areas, particularly in the exploration of multi-modal data. To bridge the gap, we introduce MATMCD, a multi-agent system powered by tool-augmented LLMs. MATMCD has two key agents: a Data Augmentation agent that retrieves and processes modality-augmented data, and a Causal Constraint agent that integrates multi-modal data for knowledge-driven reasoning. The proposed design of the inner workings ensures successful cooperation of the agents. Our empirical study across seven datasets suggests the significant potential of multi-modality enhanced causal discovery.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Causal Discovery | Sachs real-world data protein signaling network | SHD | 17 | 26 |
| Causal Discovery | AutoMPG | Structural Hamming Distance | 1 | 12 |
| Causal Discovery | DWDClimate | Structural Hamming Distance | 4 | 12 |
| Causal Discovery | Asia discrete (test) | Precision | 66 | 11 |
| Causal Discovery | Child discrete (test) | Precision | 56 | 11 |
| Root Cause Analysis | AIOps Product Review and Cloud Computing (test) | MAP@5 | 30 | 9 |
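
For readers unfamiliar with the Structural Hamming Distance (SHD) and precision metrics listed above, the following is a minimal sketch of how they are commonly computed between a ground-truth causal graph and a predicted one. It assumes graphs are given as 0/1 adjacency matrices where `adj[i][j] == 1` means an edge from node `i` to node `j`, and it assumes the SHD variant in which a reversed edge counts as a single error. The function names and the toy graphs are hypothetical, and the exact conventions used for the benchmark results above may differ.

```python
def structural_hamming_distance(true_adj, pred_adj):
    """Count node pairs whose edge status differs between the two graphs.

    Assumption: a missing, extra, or reversed edge each counts as one error.
    """
    n = len(true_adj)
    shd = 0
    for i in range(n):
        for j in range(i + 1, n):
            # Compare both directions of the pair (i, j) at once, so a
            # reversed edge is counted once rather than twice.
            true_pair = (true_adj[i][j], true_adj[j][i])
            pred_pair = (pred_adj[i][j], pred_adj[j][i])
            if true_pair != pred_pair:
                shd += 1
    return shd


def edge_precision(true_adj, pred_adj):
    """Fraction of predicted directed edges that appear in the true graph."""
    n = len(true_adj)
    predicted = sum(pred_adj[i][j] for i in range(n) for j in range(n))
    correct = sum(
        1
        for i in range(n)
        for j in range(n)
        if pred_adj[i][j] and true_adj[i][j]
    )
    # Convention chosen here: precision is 0.0 when nothing is predicted.
    return correct / predicted if predicted else 0.0


# Toy 3-node example (hypothetical data, not from any benchmark).
# True graph: 0 -> 1, 1 -> 2
true_graph = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
# Predicted graph: 0 -> 1 (correct), 2 -> 1 (reversed), 0 -> 2 (extra)
pred_graph = [
    [0, 1, 1],
    [0, 0, 0],
    [0, 1, 0],
]

shd = structural_hamming_distance(true_graph, pred_graph)
precision = edge_precision(true_graph, pred_graph)
print(f"SHD = {shd}, precision = {precision:.2f}")
```

In this toy case one edge is reversed and one is extra, so lower SHD and higher precision both indicate a predicted graph closer to the ground truth.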