
Hyperbolic Attention Networks

About

We introduce hyperbolic attention networks to endow neural networks with enough capacity to match the complexity of data with hierarchical and power-law structure. A few recent approaches have successfully demonstrated the benefits of imposing hyperbolic geometry on the parameters of shallow networks. We extend this line of work by imposing hyperbolic geometry on the activations of neural networks. This allows us to exploit hyperbolic geometry to reason about embeddings produced by deep networks. We achieve this by re-expressing the ubiquitous mechanism of soft attention in terms of operations defined for hyperboloid and Klein models. Our method shows improvements in terms of generalization on neural machine translation, learning on graphs and visual question answering tasks while keeping the neural representations compact.
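
For intuition, here is a minimal NumPy sketch of this style of attention: activations are lifted onto the hyperboloid model, matching scores come from the geodesic (hyperbolic) distance, and aggregation uses the Einstein midpoint computed in Klein coordinates. The simple lift used here, the score function -beta*d - c, and all function names are illustrative assumptions based on the abstract, not the paper's exact parameterization.

```python
import numpy as np

def lift_to_hyperboloid(v):
    # Map Euclidean vectors v of shape (..., n) onto the hyperboloid
    # H^n = {x in R^{n+1} : -x_0^2 + ||x_{1:}||^2 = -1, x_0 > 0}.
    # One simple lift; the paper's projection may differ in detail.
    x0 = np.sqrt(1.0 + np.sum(v * v, axis=-1, keepdims=True))
    return np.concatenate([x0, v], axis=-1)

def hyperbolic_attention(queries, keys, values, beta=1.0, c=0.0):
    # queries: (m, n); keys, values: (k, n) Euclidean activations.
    q = lift_to_hyperboloid(queries)   # (m, n+1)
    k = lift_to_hyperboloid(keys)      # (k, n+1)
    v = lift_to_hyperboloid(values)    # (k, n+1)

    # Matching: pairwise geodesic distances on the hyperboloid,
    # d(q, k) = arccosh(-<q, k>_L), with the Lorentzian inner product
    # <q, k>_L = -q_0 k_0 + q_{1:} . k_{1:}, turned into weights.
    lorentz = q[:, 1:] @ k[:, 1:].T - np.outer(q[:, 0], k[:, 0])
    dist = np.arccosh(np.clip(-lorentz, 1.0 + 1e-9, None))
    scores = -beta * dist - c
    alpha = np.exp(scores - scores.max(axis=1, keepdims=True))
    alpha /= alpha.sum(axis=1, keepdims=True)  # (m, k) attention weights

    # Aggregation: Einstein midpoint in the Klein model. A hyperboloid
    # point x has Klein coordinates x_{1:} / x_0, and its Lorentz factor
    # gamma = 1 / sqrt(1 - ||k||^2) equals x_0 on the hyperboloid.
    gamma = v[:, :1]                    # (k, 1) Lorentz factors
    klein = v[:, 1:] / gamma            # (k, n) Klein coordinates
    w = alpha * gamma.T                 # weights alpha_ij * gamma_j
    midpoint = (w @ klein) / w.sum(axis=1, keepdims=True)
    return midpoint                     # Klein-model outputs, shape (m, n)
```

Because the Lorentz factors are positive and the weights are normalized, the Einstein midpoint always lands inside the Klein ball, so the aggregated output is itself a valid hyperbolic point that later layers can operate on.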

Caglar Gulcehre, Misha Denil, Mateusz Malinowski, Ali Razavi, Razvan Pascanu, Karl Moritz Hermann, Peter Battaglia, Victor Bapst, David Raposo, Adam Santoro, Nando de Freitas • 2018

Related benchmarks

Task                       Dataset                      Metric            Result   Rank
Node Classification        Cora                         Accuracy          83.4     885
Node Classification        Citeseer                     Accuracy          93.9     804
Node Classification        Pubmed                       Accuracy          68.1     742
Visual Question Answering  CLEVR (test)                 Overall Accuracy  95.7     61
Machine Translation        IWSLT                        BLEU              33.8     31
Node Classification        PPI                          Accuracy          0.989    30
Image Classification       ImageNet 300 Epochs (val)    Top-1 Accuracy    49.19    7
Link Prediction            Cora                         Accuracy          79.2     4
Link Prediction            Pubmed                       Accuracy          90.8     4