Pixels to Graphs by Associative Embedding

About

Graphs are a useful abstraction of image content. Not only can graphs represent details about individual objects in a scene but they can capture the interactions between pairs of objects. We present a method for training a convolutional neural network such that it takes in an input image and produces a full graph definition. This is done end-to-end in a single stage with the use of associative embeddings. The network learns to simultaneously identify all of the elements that make up a graph and piece them together. We benchmark on the Visual Genome dataset, and demonstrate state-of-the-art performance on the challenging task of scene graph generation.

Alejandro Newell, Jia Deng• 2017

Related benchmarks

Task	Dataset	Result
Scene Graph Generation	Visual Genome (test)	R@509.7	86
Scene Graph Classification	Visual Genome (test)	Recall@10030	63
Predicate Classification	Visual Genome	Recall@5068	61
Predicate Classification	Visual Genome (test)	R@5068	50
Scene Graph Classification	Visual Genome	R@5026.5	45
Scene Graph Detection	Visual Genome	Recall@10011.3	31
Predicate Classification	Visual Genome 1.0 (test)	R@10075.2	22
Relation Prediction	VG200	R@5054.1	13
Scene Graph Generation	Visual Genome 1.0 (test)	Recall@509.7	11
Predicate Classification	Visual Genome v1.2 (test)	Recall@5054.1	11

Showing 10 of 13 rows

Other info

Follow for update

@wizwand_team Discord