pysentimiento: A Python Toolkit for Opinion Mining and Social NLP tasks
About
In recent years, the extraction of opinions and information from user-generated text has attracted considerable interest, largely due to the unprecedented volume of content on social media. However, social researchers face obstacles in adopting cutting-edge tools for these tasks, as they are usually behind commercial APIs, unavailable for languages other than English, or too complex for non-experts to use. To address these issues, we present pysentimiento, a comprehensive multilingual Python toolkit designed for opinion mining and other Social NLP tasks. This open-source library provides state-of-the-art models for Spanish, English, Italian, and Portuguese in an easy-to-use package, allowing researchers to apply these techniques. We present a comprehensive assessment of the performance of several pre-trained language models across a variety of tasks, languages, and datasets, including an evaluation of fairness in the results.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Hate Speech Detection | SBIC | Total F1 | 35 | 11 |
| Text Classification | SBIC | Total Metric | 0.0014 | 11 |
| Hate Speech Detection | CREHate | Total F1 Score | 33 | 11 |
| Text Classification | CREHate | Total Score | -0.0023 | 11 |