A General Framework for Producing Interpretable Semantic Text Embeddings
About
Semantic text embedding is essential to many tasks in Natural Language Processing (NLP). While black-box models are capable of generating high-quality embeddings, their lack of interpretability limits their use in tasks that demand transparency. Recent approaches have improved interpretability by leveraging domain-expert-crafted or LLM-generated questions, but these methods rely heavily on expert input or carefully designed prompts, which restricts their generalizability and their ability to generate discriminative questions across a wide range of tasks. To address these challenges, we introduce CQG-MBQA (Contrastive Question Generation - Multi-task Binary Question Answering), a general framework for producing interpretable semantic text embeddings across diverse tasks. Our framework systematically generates highly discriminative yes/no questions with low cognitive load through the CQG method and answers them efficiently with the MBQA model, yielding interpretable embeddings in a cost-effective manner. We validate the effectiveness and interpretability of CQG-MBQA through extensive experiments and ablation studies, demonstrating that it delivers embedding quality comparable to many advanced black-box models while remaining inherently interpretable. Additionally, CQG-MBQA outperforms other interpretable text embedding methods across various downstream tasks.
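The idea of an embedding built from yes/no answers can be illustrated with a minimal sketch. This is not the CQG-MBQA implementation: the questions, the keyword sets, and the functions `embed`, `cosine`, and `shared_yes` are hypothetical, and a simple keyword match stands in for the learned MBQA model. It only shows why each dimension of such an embedding is readable as a question.

```python
import math

# Hypothetical yes/no questions, each paired with toy trigger keywords.
# In the paper, questions come from CQG and answers from the MBQA model;
# the keyword match below is only a stand-in for illustration.
QUESTIONS = [
    ("Is the text about sports?", {"match", "team", "goal"}),
    ("Is the text about finance?", {"stock", "market", "bank"}),
    ("Does the text mention weather?", {"rain", "sunny", "storm"}),
]


def embed(text):
    """Return a binary vector: 1 where the toy answer is 'yes'."""
    words = set(text.lower().split())
    return [1 if words & keywords else 0 for _, keywords in QUESTIONS]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def shared_yes(u, v):
    """List the questions both texts answer 'yes' to (the explanation)."""
    return [QUESTIONS[i][0] for i, (a, b) in enumerate(zip(u, v)) if a and b]


a = embed("The team scored a goal in the rain")
b = embed("Rain delayed the match")
c = embed("The bank raised stock prices")
print(cosine(a, b), shared_yes(a, b))
print(cosine(a, c), shared_yes(a, c))
```

Because every dimension corresponds to one question, a similarity score can be traced back to the questions both texts answer "yes" to, which is the kind of transparency a dense black-box vector does not offer.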
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Semantic Textual Similarity | STS tasks (STS12, STS13, STS14, STS15, STS16, STS-B, SICK-R) | STS12 Score | 69.21 | 195 |
| Information Retrieval | MS Marco | nDCG@10 | 62.21 | 56 |
| Information Retrieval | SCIDOCS | nDCG@10 | 8.67 | 24 |
| Information Retrieval | ArguAna | nDCG@10 | 47.75 | 19 |
| Information Retrieval | FQA | nDCG@10 | 18.63 | 19 |
| Information Retrieval | SciFact | nDCG@10 | 32.8 | 19 |
| Information Retrieval | NFC | nDCG@10 | 9.74 | 19 |
| Clustering | MTEB Clustering v1 (test) | TNG | 40 | 18 |