Let Your Graph Do the Talking: Encoding Structured Data for LLMs

About

How can we best encode structured data into sequential form for use in large language models (LLMs)? In this work, we introduce a parameter-efficient method to explicitly represent structured data for LLMs. Our method, GraphToken, learns an encoding function to extend prompts with explicit structured information. Unlike other work which focuses on limited domains (e.g. knowledge graph representation), our work is the first effort focused on the general encoding of structured data to be used for various reasoning tasks. We show that explicitly representing the graph structure allows significant improvements to graph reasoning tasks. Specifically, we see across the board improvements - up to 73% points - on node, edge and, graph-level tasks from the GraphQA benchmark.

Bryan Perozzi, Bahare Fatemi, Dustin Zelle, Anton Tsitsulin, Mehran Kazemi, Rami Al-Rfou, Jonathan Halcrow• 2024

Related benchmarks

Task	Dataset	Result
Knowledge Graph Question Answering	WebQSP	Hit@157.05	174
Knowledge Base Question Answering	WEBQSP (test)	Hit@142.39	145
Recommendation	MovieLens 1M (test)	--	116
Knowledge Graph Question Answering	WEBQSP (test)	Hit57.05	85
Graph Understanding	GraphSQA Macro-level Tasks 1.0 (test)	Score44.93	55
Graph Understanding	GraphSQA Micro-level Tasks 1.0 (test)	Score31.61	55
Graph Understanding	GraphSQA Overall 1.0 (test)	Score35.5	55
Recommendation	MovieLens 1M	--	49
Question Answering	WebQSP	--	35
Recommendation	MovieLens 20M (test)	Accuracy47.3	24

Showing 10 of 42 rows

Other info

Follow for update

@wizwand_team Discord