DESSERT: An Efficient Algorithm for Vector Set Search with Vector Set Queries
About
We study the problem of $\textit{vector set search}$ with $\textit{vector set queries}$. This task is analogous to traditional near-neighbor search, with the exception that both the query and each element in the collection are $\textit{sets}$ of vectors. We identify this problem as a core subroutine for semantic search applications and find that existing solutions are unacceptably slow. Towards this end, we present a new approximate search algorithm, DESSERT (${\bf D}$ESSERT ${\bf E}$ffeciently ${\bf S}$earches ${\bf S}$ets of ${\bf E}$mbeddings via ${\bf R}$etrieval ${\bf T}$ables). DESSERT is a general tool with strong theoretical guarantees and excellent empirical performance. When we integrate DESSERT into ColBERT, a state-of-the-art semantic search model, we find a 2-5x speedup on the MS MARCO and LoTTE retrieval benchmarks with minimal loss in recall, underscoring the effectiveness and practical applicability of our proposal.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Information Retrieval | ArguAna | QPS576 | 9 | |
| Information Retrieval | Quora | QPS284 | 9 | |
| Information Retrieval | NQ | QPS38 | 8 | |
| Passage retrieval | MS-MARCO (test) | Latency (ms)9.5 | 6 | |
| Multi-Vector Retrieval | SCIDOCS | QPS285 | 5 |