Hypergraph as Language
About
Large language models (LLMs) have recently shown strong potential in modeling relational structures. However, existing approaches remain fundamentally graph-centric: they convert pairwise graph structures into tokens that LLMs can understand. Many real-world relational patterns do not conform to the pairwise-edge assumption and are better modeled as high-order associations in hypergraphs. For hypergraph structures, existing methods often fail to preserve the native semantics in which multiple objects are jointly connected by the same high-order relation, which limits their ability to exploit complex structures. To address this limitation, we put forth the "Hypergraph as Language" perspective and propose Hyper-Align, a hypergraph-native alignment framework for LLMs. Hyper-Align compiles the hypergraph context centered on the query object into hypergraph tokens that a base LLM can consume directly. Specifically, we introduce the Hypergraph Incidence Detail Template with Overview (HIDT-O), which serializes high-order association structures into a fixed-shape hybrid template combining local incidence details and overview-level summaries. We then design a Hypergraph Incidence Projector (HIP), which maps native high-order incidence structures into the LLM token space through explicit semantic-structural decoupling and bidirectional message passing between vertices and hyperedges. We further define a concrete Hypergraph-as-Language input protocol, which jointly feeds hypergraph tokens and textual prompts into a frozen base LLM and supports both vertex-level and hyperedge-level tasks under a unified question-answering paradigm. To systematically evaluate methods on hypergraph structural modeling, we introduce HyperAlign-Bench. Extensive experiments show that Hyper-Align significantly outperforms existing methods in both in-domain and zero-shot evaluations.
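To make the notion of bidirectional message passing between vertices and hyperedges concrete, the following is a minimal sketch of one generic round of incidence-based aggregation on a toy hypergraph. It is not the HIP module described above: the function names (`incidence_matrix`, `vertex_to_hyperedge`, `hyperedge_to_vertex`), the mean aggregation, and the toy features are all assumptions chosen for illustration, and the sketch omits semantic-structural decoupling and the projection into the LLM token space.

```python
from typing import List, Sequence

Vector = List[float]


def incidence_matrix(num_vertices: int,
                     hyperedges: Sequence[Sequence[int]]) -> List[List[int]]:
    """Return H with H[v][e] = 1 if vertex v belongs to hyperedge e, else 0."""
    H = [[0] * len(hyperedges) for _ in range(num_vertices)]
    for e, members in enumerate(hyperedges):
        for v in members:
            H[v][e] = 1
    return H


def mean(vectors: List[Vector], dim: int) -> Vector:
    """Element-wise mean; a zero vector if there is nothing to average."""
    if not vectors:
        return [0.0] * dim
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


def vertex_to_hyperedge(H: List[List[int]], X: List[Vector]) -> List[Vector]:
    """Each hyperedge averages the features of its member vertices."""
    dim = len(X[0])
    return [mean([X[v] for v in range(len(H)) if H[v][e]], dim)
            for e in range(len(H[0]))]


def hyperedge_to_vertex(H: List[List[int]], E: List[Vector]) -> List[Vector]:
    """Each vertex averages the features of the hyperedges it belongs to."""
    dim = len(E[0])
    return [mean([E[e] for e in range(len(H[v])) if H[v][e]], dim)
            for v in range(len(H))]


# Toy hypergraph (hypothetical data): 4 vertices, 2 hyperedges.
# Hyperedge 0 joins three vertices at once, which a pairwise edge cannot express.
hyperedges = [[0, 1, 2], [2, 3]]
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [3.0, 3.0]]

H = incidence_matrix(4, hyperedges)
E = vertex_to_hyperedge(H, X)      # vertex -> hyperedge messages
X_new = hyperedge_to_vertex(H, E)  # hyperedge -> vertex messages
```

In this toy case, vertex 2 belongs to both hyperedges, so its updated feature blends information from vertices 0, 1, and 3, while each hyperedge keeps a single representation shared by all of its members rather than being split into pairwise links.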
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Node Classification | IMDB | -- | -- | 211 |
| Graph Classification | obgn-arXiv (test) | Accuracy | 78.2 | 28 |
| Vertex Classification | Arxiv-HG In-domain (test) | Accuracy | 76.9 | 18 |
| Hyperedge Classification | Cora-CC | Accuracy | 75.7 | 9 |
| Hyperedge Classification | Pubmed | Accuracy | 77.6 | 9 |
| Hyperedge Classification | DBLP | Accuracy | 64.6 | 9 |
| Hyperedge Classification | IMDB | Accuracy | 44.9 | 9 |
| Vertex Classification | Cora-CC | Accuracy | 74.8 | 9 |
| Vertex Classification | Pubmed | Accuracy | 77.5 | 9 |
| Vertex Classification | DBLP | Accuracy | 67.2 | 9 |