
Exploring Nearest Neighbor Approaches for Image Captioning

About

We explore a variety of nearest neighbor baseline approaches for image captioning. These approaches find a set of nearest neighbor images in the training set from which a caption may be borrowed for the query image. We select a caption for the query image by finding the caption that best represents the "consensus" of the set of candidate captions gathered from the nearest neighbor images. When measured by automatic evaluation metrics on the MS COCO caption evaluation server, these approaches perform as well as many recent approaches that generate novel captions. However, human studies show that a method that generates novel captions is still preferred over the nearest neighbor approach.
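The selection procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the image features or the caption similarity measure, so cosine similarity over feature vectors and word-set (Jaccard) overlap are used here as stand-ins, and all function names and the toy data are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two feature vectors (assumed image similarity)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def token_overlap(c1: str, c2: str) -> float:
    """Jaccard overlap of word sets (assumed stand-in for a caption similarity metric)."""
    s1, s2 = set(c1.lower().split()), set(c2.lower().split())
    union = s1 | s2
    return len(s1 & s2) / len(union) if union else 0.0


def consensus_caption(
    query_feat: Sequence[float],
    train_feats: List[Sequence[float]],
    train_captions: List[List[str]],
    k: int = 2,
) -> str:
    """Borrow the caption that best agrees with the other candidate captions."""
    # Step 1: rank training images by similarity to the query and keep the top k.
    ranked = sorted(
        range(len(train_feats)),
        key=lambda i: cosine(query_feat, train_feats[i]),
        reverse=True,
    )
    # Step 2: pool the captions of the k nearest neighbor images.
    candidates = [c for i in ranked[:k] for c in train_captions[i]]

    # Step 3: score each candidate by its mean similarity to the other candidates.
    def score(idx: int) -> float:
        others = [
            token_overlap(candidates[idx], c)
            for j, c in enumerate(candidates)
            if j != idx
        ]
        return sum(others) / len(others) if others else 0.0

    # Step 4: return the highest-scoring ("consensus") caption.
    return candidates[max(range(len(candidates)), key=score)]


# Toy data (hypothetical): 2-D features and captions for three training images.
train_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
train_captions = [
    ["a dog runs on the grass", "a dog plays in a field"],
    ["a dog runs in a field", "a brown dog outside"],
    ["a plate of food on a table"],
]

result = consensus_caption([1.0, 0.05], train_feats, train_captions, k=2)
print(result)  # a dog runs in a field
```

In this toy run the two dog images are the nearest neighbors, and "a dog runs in a field" is selected because it shares the most words with the other three candidate captions. A real system would replace both similarity functions with learned image features and a caption evaluation metric.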

Jacob Devlin, Saurabh Gupta, Ross Girshick, Margaret Mitchell, C. Lawrence Zitnick • 2015

Related benchmarks

Task                       Dataset                              Result                 Rank
Image Captioning           MS-COCO (test)                       CIDEr 89               117
Visual Question Answering  COCO-QA (test)                       WUPS (IoU=0.9) 56.98   51
Image Captioning           COCO 2014 (test)                     CIDEr 0.916            44
Visual Question Answering  DAQUAR single-word answers portion   Accuracy 31.85         11
