Let Your Graph Do the Talking: Encoding Structured Data for LLMs
About
How can we best encode structured data into sequential form for use in large language models (LLMs)? In this work, we introduce a parameter-efficient method to explicitly represent structured data for LLMs. Our method, GraphToken, learns an encoding function that extends prompts with explicit structured information. Unlike prior work focused on limited domains (e.g. knowledge graph representation), ours is the first effort aimed at general-purpose encoding of structured data for a variety of reasoning tasks. We show that explicitly representing the graph structure yields significant improvements on graph reasoning tasks. Specifically, we see across-the-board improvements - up to 73 percentage points - on node-, edge-, and graph-level tasks from the GraphQA benchmark.
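At a high level, a GraphToken-style encoder maps a graph to a small set of continuous "soft prompt" vectors that are prepended to the LLM's input embeddings. The sketch below is illustrative only, not the paper's implementation: it uses one round of neighbor averaging as a minimal GNN layer, a mean-pool readout, and a learned linear projection (`W_enc`, here random) into a hypothetical embedding space.

```python
import numpy as np

def graph_to_prompt_embeddings(adj, node_feats, W_enc, num_prompt_tokens=4):
    """Illustrative GraphToken-style encoder (a sketch, not the paper's code).

    One round of neighbor averaging (a minimal GNN layer), then a linear
    projection maps the pooled graph representation to `num_prompt_tokens`
    soft-prompt vectors in the LLM's embedding space.
    """
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)
    h = adj @ node_feats / deg           # aggregate neighbor features
    h = np.maximum(h + node_feats, 0.0)  # residual connection + ReLU
    g = h.mean(axis=0)                   # graph-level readout (mean pooling)
    # W_enc: (feat_dim, num_prompt_tokens * embed_dim); learned in practice
    return (g @ W_enc).reshape(num_prompt_tokens, -1)

# Usage: prepend the soft prompt to the embeddings of the text prompt.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)  # 3-node star
feats = rng.normal(size=(3, 8))          # node features, feat_dim = 8
W = rng.normal(size=(8, 4 * 16))         # embed_dim = 16 (hypothetical)
soft_prompt = graph_to_prompt_embeddings(adj, feats, W)   # (4, 16)
token_embeds = rng.normal(size=(10, 16)) # embeddings of 10 text tokens
llm_input = np.concatenate([soft_prompt, token_embeds], axis=0)  # (14, 16)
```

Only the encoder's parameters would be trained (the LLM stays frozen), which is what makes the approach parameter-efficient.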
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Knowledge Base Question Answering | WEBQSP (test) | Hit@1 | 42.39 | 143 |
| Graph Understanding | GraphQA Macro-level Tasks 1.0 (test) | Score | 44.93 | 55 |
| Graph Understanding | GraphQA Micro-level Tasks 1.0 (test) | Score | 31.61 | 55 |
| Graph Understanding | GraphQA Overall 1.0 (test) | Score | 35.5 | 55 |
| Recommendation | MovieLens 1M (test) | Recall@3 | 0.622 | 34 |
| Knowledge Graph Question Answering | WEBQSP (test) | Hit | 57.05 | 30 |
| Recommendation | MovieLens 20M (test) | Accuracy | 47.3 | 24 |
| Graph Node Classification and Link Prediction | Daily Life | n-F1 | 92.5 | 23 |
| Graph Node Classification and Link Prediction | Multimedia | n-F1 | 74.57 | 23 |
| Graph Node Classification and Link Prediction | HuggingFace | n-F1 | 64.42 | 23 |