VnCoreNLP: A Vietnamese Natural Language Processing Toolkit
About
We present an easy-to-use and fast toolkit, namely VnCoreNLP---a Java NLP annotation pipeline for Vietnamese. Our VnCoreNLP supports key natural language processing (NLP) tasks including word segmentation, part-of-speech (POS) tagging, named entity recognition (NER) and dependency parsing, and obtains state-of-the-art (SOTA) results for these tasks. We release VnCoreNLP to provide rich linguistic annotations to facilitate research work on Vietnamese NLP. Our VnCoreNLP is open-source and available at: https://github.com/vncorenlp/VnCoreNLP
Thanh Vu, Dat Quoc Nguyen, Dai Quoc Nguyen, Mark Dras, Mark Johnson• 2018
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Word Segmentation | VLSP 2013 | F1 Score97.8 | 9 | |
| Tokenisation | Wikipedia/OpenWebText | F1 Score96.5 | 9 |
Showing 2 of 2 rows