When LLM Agents Meet Graph Optimization: An Automated Data Quality Improvement Approach

About

Text-attributed graphs (TAGs) have become a key form of graph-structured data in modern data management and analytics, combining structural relationships with rich textual semantics for diverse applications. However, the effectiveness of analytical models, particularly graph neural networks (GNNs), is highly sensitive to data quality. Our empirical analysis shows that both conventional and LLM-enhanced GNNs degrade notably under textual, structural, and label imperfections, underscoring TAG quality as a key bottleneck for reliable analytics. Existing studies have explored data-level optimization for TAGs, but most focus on specific degradation types and target a single aspect, such as structure or labels, and so lack a systematic and comprehensive perspective on data quality improvement. To address this gap, we propose LAGA (Large Language and Graph Agent), a unified multi-agent framework for comprehensive TAG quality optimization. LAGA formulates graph quality control as a data-centric process, integrating detection, planning, action, and evaluation agents into an automated loop. It holistically enhances the textual, structural, and label aspects of a TAG through coordinated multi-modal optimization. Extensive experiments on 5 datasets and 16 baselines across 9 scenarios demonstrate the effectiveness, robustness, and scalability of LAGA, confirming the importance of data-centric quality optimization for reliable TAG analytics.

Zhihan Zhang, Xunkai Li, Yilong Zuo, Henan Sun, Zhenjun Li, Bing Zhou, Rong-Hua Li, Guoren Wang • 2025
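
The abstract describes an automated loop of detection, planning, action, and evaluation over the textual, structural, and label aspects of a TAG. As a rough illustration of that control flow only, here is a minimal sketch. It is not the LAGA implementation: the names `ToyTAG`, `detect`, `plan`, `act`, `evaluate`, and `run_loop` are hypothetical, and the simple rules (empty text, self-loops, missing labels, neighbor-majority label filling) are stand-ins for whatever the paper's LLM-based agents actually do.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ToyTAG:
    """Hypothetical toy text-attributed graph (not a LAGA data structure)."""
    texts: dict   # node id -> text
    edges: set    # edges as (u, v) tuples
    labels: dict  # node id -> label, or None when missing


def detect(g):
    """Detection step: list (aspect, target) issues using toy rules."""
    issues = [("text", n) for n, t in g.texts.items() if not t.strip()]
    issues += [("structure", e) for e in g.edges if e[0] == e[1]]
    issues += [("label", n) for n, y in g.labels.items() if y is None]
    return issues


def plan(issues):
    """Planning step: assumed order is text, then structure, then labels."""
    priority = {"text": 0, "structure": 1, "label": 2}
    return sorted(issues, key=lambda issue: priority[issue[0]])


def act(g, issue):
    """Action step: apply a toy repair in place for one issue."""
    aspect, target = issue
    if aspect == "text":
        # Placeholder only; a real system might ask an LLM to rewrite the text.
        g.texts[target] = "[text to be regenerated]"
    elif aspect == "structure":
        g.edges.discard(target)  # drop the self-loop
    elif aspect == "label":
        # Majority vote over the labels of neighboring nodes.
        votes = Counter(
            g.labels[v if u == target else u]
            for u, v in g.edges
            if target in (u, v) and u != v
        )
        votes.pop(None, None)  # ignore neighbors that are unlabeled themselves
        if votes:
            g.labels[target] = votes.most_common(1)[0][0]


def evaluate(g):
    """Evaluation step: number of issues still detected (lower is better)."""
    return len(detect(g))


def run_loop(g, max_rounds=5):
    """Repeat detect -> plan -> act -> evaluate until clean or no progress."""
    history = [evaluate(g)]
    for _ in range(max_rounds):
        issues = detect(g)
        if not issues:
            break
        for issue in plan(issues):
            act(g, issue)
        history.append(evaluate(g))
        if history[-1] >= history[-2]:
            break  # no improvement this round
    return history


g = ToyTAG(
    texts={0: "graph paper", 1: "", 2: "gnn paper"},
    edges={(0, 1), (1, 2), (2, 2)},
    labels={0: "ML", 1: None, 2: "ML"},
)
history = run_loop(g)
print(history)  # [3, 0]
```

The loop stops either when detection finds nothing or when a round fails to reduce the issue count, which keeps the sketch from spinning on issues its toy rules cannot repair (for example, an unlabeled node with no labeled neighbors).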

Related benchmarks

Task                 Dataset                         Metric        Result  Rank
Node Classification  Cora                            Accuracy      90.88   583
Node Classification  Ogbn-arxiv                      Accuracy      74.12   304
Node Classification  Photo                           Accuracy      87.22   153
Node Classification  Citeseer                        Accuracy (%)  83.22   105
Node Classification  ogbn-arxiv Text Imbalance       Accuracy      74.12   8
Node Classification  ogbn-arxiv Structure Sparsity   Accuracy      74.12   8
Node Classification  ogbn-arxiv Structure Noise      Accuracy      74.12   8
Node Classification  ogbn-arxiv Structure Imbalance  Accuracy      74.12   8
Node Classification  ogbn-arxiv Label Sparsity       Accuracy      74.12   8
Node Classification  ogbn-arxiv Label Noise          Accuracy      74.12   8
Showing 10 of 11 rows
