Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion

About

Real-world data generation often involves complex inter-dependencies among instances, violating the IID-data hypothesis of standard learning paradigms and posing a challenge for uncovering the geometric structures for learning desired instance representations. To this end, we introduce an energy constrained diffusion model which encodes a batch of instances from a dataset into evolutionary states that progressively incorporate other instances' information by their interactions. The diffusion process is constrained by descent criteria w.r.t.~a principled energy function that characterizes the global consistency of instance representations over latent structures. We provide rigorous theory that implies closed-form optimal estimates for the pairwise diffusion strength among arbitrary instance pairs, which gives rise to a new class of neural encoders, dubbed as DIFFormer (diffusion-based Transformers), with two instantiations: a simple version with linear complexity for prohibitive instance numbers, and an advanced version for learning complex structures. Experiments highlight the wide applicability of our model as a general-purpose encoder backbone with superior performance in various tasks, such as node classification on large graphs, semi-supervised image/text classification, and spatial-temporal dynamics prediction.

Qitian Wu, Chenxiao Yang, Wentao Zhao, Yixuan He, David Wipf, Junchi Yan• 2023

Related benchmarks

TaskDatasetResultRank
Node ClassificationPubmed
Accuracy81.6
865
Node ClassificationPubMed (test)
Accuracy36.1
586
Node Classificationogbn-arxiv (test)
Accuracy72.21
497
Node ClassificationPhoto
Mean Accuracy95.1
374
Node ClassificationChameleon (test)
Mean Accuracy44.44
335
Node ClassificationwikiCS
Accuracy73.46
329
Node ClassificationRoman-Empire
Accuracy79.1
327
Node Classificationamazon-ratings
Accuracy47.84
309
Node ClassificationActor (test)
Mean Accuracy0.3673
286
Node ClassificationREDDIT
Accuracy55.3
216
Showing 10 of 68 rows

Other info

Follow for update