
Order-Embeddings of Images and Language

About

Hypernymy, textual entailment, and image captioning can be seen as special cases of a single visual-semantic hierarchy over words, sentences, and images. In this paper we advocate for explicitly modeling the partial order structure of this hierarchy. Towards this goal, we introduce a general method for learning ordered representations, and show how it can be applied to a variety of tasks involving images and language. We show that the resulting representations improve performance over current approaches for hypernym prediction and image-caption retrieval.

Ivan Vendrov, Ryan Kiros, Sanja Fidler, Raquel Urtasun • 2015
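
The abstract above describes learning representations that respect a partial order, but this page does not give the formulation. As a minimal sketch, assume the reversed coordinate-wise order on non-negative vectors, where a more general concept lies closer to the origin than anything it subsumes, and score a pair by the squared norm of the positive part of the difference. The function name `order_violation` and the toy embeddings are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def order_violation(specific: Sequence[float], general: Sequence[float]) -> float:
    """Squared norm of max(0, general - specific), taken coordinate-wise.

    Assumption: `general` subsumes `specific` exactly when every coordinate
    of `general` is <= the matching coordinate of `specific`. The penalty is
    zero in that case and grows with the size of the violation.
    """
    if len(specific) != len(general):
        raise ValueError("embeddings must have the same dimension")
    return sum(max(0.0, g - s) ** 2 for s, g in zip(specific, general))


# Hypothetical 3-d embeddings, picked by hand (not learned, not from the paper).
animal = [0.1, 0.2, 0.0]
dog = [0.5, 0.9, 0.3]

print(order_violation(dog, animal))  # 0.0: "dog is an animal" respects the order
print(order_violation(animal, dog))  # positive: the reversed pair violates it
```

Because the penalty is asymmetric, swapping the arguments changes the score, which is what lets a single embedding space represent directed relations such as hypernymy or caption-to-image entailment rather than only symmetric similarity.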

Related benchmarks

Task                        Dataset                                             Result         Rank
Natural Language Inference  SNLI (test)                                         Accuracy 88.6  681
Image-to-Text Retrieval     MS-COCO 5K (test)                                   R@1 23.3       299
Text-to-Image Retrieval     MSCOCO 5K (test)                                    R@1 31.7       286
Natural Language Inference  SNLI (train)                                        Accuracy 98.8  154
Image Retrieval             MS-COCO 1K (test)                                   R@1 37.9       128
Text-to-Image Retrieval     MSCOCO (1K test)                                    R@1 37.9       104
Image-to-Text Retrieval     MSCOCO (1K test)                                    R@1 46.7       82
Caption Retrieval           MS COCO Karpathy 1k (test)                          R@1 46.7       62
Link Prediction             WordNet noun hierarchy (transitive closure) (test)  F1 84.1        40
Caption Retrieval           MS COCO Karpathy 5k (test)                          R@1 31.7       26
Showing 10 of 30 rows
