
Generative Semantic Hashing Enhanced via Boltzmann Machines

About

Generative semantic hashing is a promising technique for large-scale information retrieval thanks to its fast retrieval speed and small memory footprint. For tractability of training, existing generative-hashing methods mostly assume a factorized form for the posterior distribution, enforcing independence among the bits of hash codes. From the perspectives of both model representation and code-space size, independence is not always the best assumption. In this paper, to introduce correlations among the bits of hash codes, we propose to employ the distribution of a Boltzmann machine as the variational posterior. To address the intractability of training, we first develop an approximate method to reparameterize the distribution of a Boltzmann machine by augmenting it as a hierarchical concatenation of a Gaussian-like distribution and a Bernoulli distribution. Based on that, an asymptotically-exact lower bound is further derived for the evidence lower bound (ELBO). With these novel techniques, the entire model can be optimized efficiently. Extensive experimental results demonstrate that by effectively modeling correlations among different bits within a hash code, our model can achieve significant performance gains.
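To illustrate why hash codes give the fast retrieval and small memory footprint mentioned above, here is a minimal sketch (not the paper's code; the code length, corpus, and helper names are assumptions) of retrieval over binary codes ranked by Hamming distance:

```python
import numpy as np

# Hypothetical setup: 1000 documents, each stored as a 32-bit binary hash code.
rng = np.random.default_rng(0)
n_docs, n_bits = 1000, 32
db_codes = rng.integers(0, 2, size=(n_docs, n_bits), dtype=np.uint8)

def hamming_search(query_code, codes, top_k=5):
    """Return indices of the top_k stored codes closest to query_code in Hamming distance."""
    # Hamming distance = number of differing bits; a cheap vectorized comparison.
    dists = np.count_nonzero(codes != query_code, axis=1)
    return np.argsort(dists, kind="stable")[:top_k]

# Simulate a near-duplicate query by flipping two bits of document 42's code.
query = db_codes[42].copy()
query[:2] ^= 1
top = hamming_search(query, db_codes)
```

Storing each document as 32 bits instead of a dense float vector is what keeps the memory footprint small, and the bitwise comparison is what makes retrieval fast; the paper's contribution concerns how such codes are *learned*, namely with a Boltzmann-machine posterior that correlates the bits.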

Lin Zheng, Qinliang Su, Dinghan Shen, Changyou Chen • 2020

Related benchmarks

Task                           | Dataset      | Result (Precision) | Rank
Unsupervised document hashing  | 20Newsgroups | 63.59              | 36
Unsupervised document hashing  | TMC          | 76.32              | 32
Unsupervised document hashing  | Reuters      | 84.82              | 32
