
Beyond Prefixes: Graph-as-Memory Cross-Attention for Knowledge Graph Completion with Large Language Models

About

Fusing Knowledge Graphs with Large Language Models (LLMs) is crucial for knowledge-intensive tasks such as knowledge graph completion. Existing LLM-based approaches typically inject graph information via prefix concatenation, resulting in shallow interactions that fail to support fine-grained evidence retrieval during generation. Moving beyond prefixes, we propose Graph-as-Memory Tuning (GMT), a new paradigm that represents local graph structure as explicit graph memory and injects it into LLMs via deep, token-wise cross-attention. Specifically, GMT first employs a Semantic Graph Module to encode context-aware semantics from local neighborhoods, guided by knowledge-enhanced relations, and compresses them into a fixed number of graph memory tokens. A Graph-as-Memory Cross-Attention Fusion Module then integrates these tokens into multiple Transformer layers, allowing the LLM's hidden states to dynamically retrieve relevant graph evidence. To enable efficient adaptation, GMT applies LoRA only to the memory cross-attention while keeping the base LLM frozen. Extensive experiments show that GMT significantly outperforms prefix-tuning and other strong baselines, providing more potent signals for robust reasoning. The code is available at https://github.com/tongruiliu/GMT.
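The core fusion step described above can be sketched as a single cross-attention between LLM hidden states (queries) and a fixed set of graph memory tokens (keys/values), with the retrieved evidence added residually. This is a minimal numpy illustration under assumed shapes and projection names; the paper's actual module (multi-head attention, LoRA-adapted projections, per-layer injection) is more elaborate.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def graph_memory_cross_attention(hidden, memory, Wq, Wk, Wv):
    """Toy single-head graph-memory cross-attention (illustrative only).

    hidden: (seq_len, d)  LLM hidden states at one Transformer layer
    memory: (num_mem, d)  fixed-size graph memory tokens
    Wq, Wk, Wv: (d, d)    assumed projection matrices
    Returns hidden states with retrieved graph evidence fused residually.
    """
    q = hidden @ Wq                            # queries from LLM tokens
    k = memory @ Wk                            # keys from graph memory
    v = memory @ Wv                            # values from graph memory
    scores = q @ k.T / np.sqrt(q.shape[-1])    # (seq_len, num_mem)
    attn = softmax(scores, axis=-1)            # token-wise retrieval weights
    return hidden + attn @ v                   # residual fusion

# Toy sizes (assumptions, not from the paper)
rng = np.random.default_rng(0)
d, seq_len, num_mem = 16, 5, 8
hidden = rng.normal(size=(seq_len, d))
memory = rng.normal(size=(num_mem, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))

out = graph_memory_cross_attention(hidden, memory, Wq, Wk, Wv)
print(out.shape)  # (5, 16): same shape as the input hidden states
```

Because each output token attends over only `num_mem` memory slots rather than the full neighborhood, the cost of the fusion stays fixed regardless of how large the local subgraph is, which is what makes compressing the neighborhood into a fixed number of memory tokens attractive.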

Ruitong Liu, Boxu Lin, Peize Li, Siyuan Li, Yunjia Wu, Te Sun, Chaohan Wu • 2025

Related benchmarks

Task                   Dataset      Metric     Result   Rank
Link Prediction        FB15k-237    MRR        48.8     293
Link Prediction        WN18RR       Hits@10    70.3     188
Triple Classification  UMLS         Accuracy   94.55    18
Triple Classification  CoDEx-S      Accuracy   89.01    12
Triple Classification  FB15K-237N   Accuracy   84.1     12
