Text Embeddings Reveal (Almost) As Much As Text
About
How much private information do text embeddings reveal about the original text? We investigate the problem of embedding *inversion*: reconstructing the full text represented in dense text embeddings. We frame the problem as controlled generation: generating text that, when re-embedded, is close to a fixed point in latent space. We find that although a naïve model conditioned on the embedding performs poorly, a multi-step method that iteratively corrects and re-embeds text is able to recover 92% of 32-token text inputs exactly. We train our model to decode text embeddings from two state-of-the-art embedding models, and also show that our model can recover important personal information (full names) from a dataset of clinical notes. Our code is available on GitHub: [github.com/jxmorris12/vec2text](https://github.com/jxmorris12/vec2text).
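The iterative correct-and-re-embed loop can be sketched with a toy example. This is not the paper's actual model (vec2text trains a neural corrector conditioned on the target embedding); here a bag-of-words embedder and a greedy word-swap corrector stand in for the learned components, purely to illustrate the controlled-generation framing: propose text, re-embed it, and correct it to move closer to a fixed target in latent space. `VOCAB`, `embed`, and `correct_step` are illustrative names, not from the paper's codebase.

```python
import numpy as np

# Tiny stand-in vocabulary for the toy embedder.
VOCAB = ["private", "information", "text", "embeddings", "reveal", "almost", "as", "much"]

def embed(words):
    # Toy embedder: bag-of-words count vector over VOCAB.
    v = np.zeros(len(VOCAB))
    for w in words:
        v[VOCAB.index(w)] += 1
    return v

def correct_step(hypothesis, target_vec):
    # One correction step: greedily swap a single word so that the
    # re-embedded hypothesis moves closer to the target embedding.
    best = hypothesis
    best_dist = np.linalg.norm(embed(hypothesis) - target_vec)
    for i in range(len(hypothesis)):
        for w in VOCAB:
            cand = hypothesis[:i] + [w] + hypothesis[i + 1:]
            d = np.linalg.norm(embed(cand) - target_vec)
            if d < best_dist:
                best, best_dist = cand, d
    return best

def invert(target_vec, length, steps=10):
    # Multi-step inversion: start from a naive guess, then iterate
    # correct -> re-embed until the hypothesis stops improving.
    hyp = [VOCAB[0]] * length
    for _ in range(steps):
        new = correct_step(hyp, target_vec)
        if new == hyp:
            break
        hyp = new
    return hyp

secret = ["embeddings", "reveal", "private", "information"]
recovered = invert(embed(secret), len(secret))
# Bag-of-words embeddings discard word order, so we recover the
# multiset of words exactly; real embedders leak order too.
print(sorted(recovered) == sorted(secret))  # → True
```

The real system replaces the greedy swap with a trained sequence-to-sequence corrector, but the control loop — hypothesize, re-embed, compare, refine — is the same shape.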
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Embedding-to-Abstract Reconstruction (emb2abs) | PubMedRCT (test) | Semantic Consistency | 82 | 27 |
| Text Reconstruction from Embeddings | MS MARCO | BLEU-1 | 12.82 | 20 |
| Text Reconstruction from Embeddings | PubMed | BLEU-1 | 11.39 | 20 |
| Abstract Generation from Embeddings | 5-task 1.2M dataset | Win Rate (Orig) | 0.01 | 8 |
| Top-1 White-box Attack | FiQA (test) | ASR | 48.3 | 4 |
| Top-1 White-box Attack | TREC DL 19 (test) | ASR | 73.5 | 4 |
| Top-1 White-box Attack | TREC DL 20 (test) | ASR | 67.9 | 4 |
| Top-1 White-box Attack | NQ (test) | ASR | 6.3 | 4 |
| Top-1 White-box Attack | Quora (test) | ASR | 0.3 | 4 |
| Top-1 White-box Attack | Touché 2020 (test) | ASR | 40.8 | 4 |