Enhancing Sentence Embedding with Generalized Pooling
About
Pooling is an essential component of a wide variety of sentence representation and embedding models. This paper explores generalized pooling methods to enhance sentence embedding. We propose vector-based multi-head attention, which includes the widely used max pooling, mean pooling, and scalar self-attention as special cases. The model benefits from properly designed penalization terms that reduce redundancy across attention heads. We evaluate the proposed model on three tasks: natural language inference (NLI), author profiling, and sentiment classification. The experiments show that the proposed model achieves significant improvement over strong sentence-encoding-based methods, reaching state-of-the-art performance on four datasets. The approach can be readily applied to problems beyond those discussed in this paper.
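The core idea described above is that each attention head produces a *vector* of weights per timestep (one weight per hidden dimension) rather than a single scalar, so mean pooling and max pooling fall out as special cases. Below is a minimal NumPy sketch of one such head; the function and parameter names (`vector_pool`, `W1`, `W2`) are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def vector_pool(H, W1, W2):
    """One head of vector-based attention pooling (illustrative sketch).

    H:  (T, d) hidden states over T timesteps.
    W1: (d, da), W2: (da, d) — assumed projection matrices.
    Returns a (d,) pooled sentence vector.

    The attention logits form a (T, d) matrix, so every hidden
    dimension gets its own mixing weights over time.
    """
    scores = np.tanh(H @ W1) @ W2       # (T, d) per-dimension logits
    A = softmax(scores, axis=0)         # normalize over timesteps
    return (A * H).sum(axis=0)          # per-dimension weighted sum
```

Special cases: with all-zero projections the weights are uniform over timesteps and the head reduces to mean pooling; if the logits are a large positive multiple of `H` itself, each dimension's weight concentrates on its maximum value, approximating max pooling; and tying the `d` logit columns to a single shared column recovers scalar self-attention.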
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Natural Language Inference | SNLI (test) | Accuracy | 86.6 | 681 |
| Natural Language Inference | MultiNLI matched (test) | Accuracy | 73.8 | 65 |
| Natural Language Inference | MultiNLI mismatched | Accuracy | 74 | 60 |
| Natural Language Inference | MultiNLI mismatched (test) | Accuracy | 74 | 56 |
| Sentiment Classification | Yelp | Accuracy | 66.55 | 24 |
| Natural Language Inference | MultiNLI matched (in-domain) | Accuracy | 73.8 | 8 |
| Author Profiling | Age | Accuracy | 82.63 | 7 |