
Biomedical and Clinical English Model Packages in the Stanza Python NLP Library

About

We introduce biomedical and clinical English model packages for the Stanza Python NLP library. These packages offer accurate syntactic analysis and named entity recognition capabilities for biomedical and clinical text, by combining Stanza's fully neural architecture with a wide variety of open datasets as well as large-scale unsupervised biomedical and clinical text data. We show via extensive experiments that our packages achieve syntactic analysis and named entity recognition performance that is on par with or surpasses state-of-the-art results. We further show that these models do not compromise speed compared to existing toolkits when GPU acceleration is available, and are made easy to download and use with Stanza's Python interface. A demonstration of our packages is available at: http://stanza.run/bio.
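As a rough illustration of the "easy to download and use" claim, the sketch below loads one of the clinical packages through Stanza's Python interface. The package name `mimic` and NER model name `i2b2` follow the Stanza documentation as best recalled and may differ across versions. The helper names `pipeline_config` and `analyze` are hypothetical, introduced here for illustration only.

```python
def pipeline_config(package="mimic", ner_model="i2b2"):
    """Build keyword arguments shared by stanza.download and stanza.Pipeline.

    The default package and NER model names are assumptions taken from the
    Stanza documentation; check the model list for your installed version.
    """
    return {"lang": "en", "package": package, "processors": {"ner": ner_model}}


def analyze(text, package="mimic", ner_model="i2b2"):
    """Download the models if needed, run the pipeline, and return
    (words, entities) as lists of plain tuples."""
    # Imported here so that pipeline_config stays usable without Stanza installed.
    import stanza

    config = pipeline_config(package, ner_model)
    stanza.download(**config)          # fetches models on first use
    nlp = stanza.Pipeline(**config)    # uses a GPU automatically if one is available
    doc = nlp(text)

    # Syntactic analysis: one tuple per word with POS tag, head index, and relation.
    words = [
        (word.text, word.upos, word.head, word.deprel)
        for sentence in doc.sentences
        for word in sentence.words
    ]
    # Named entities: surface text and entity type.
    entities = [(ent.text, ent.type) for ent in doc.entities]
    return words, entities
```

Calling `analyze("The patient was given aspirin for chest pain.")` would download the models on first use and return the parsed words and recognized entities. The exact tags and entity types depend on the chosen package and model.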

Yuhao Zhang, Yuhui Zhang, Peng Qi, Christopher D. Manning, Curtis P. Langlotz (2020)

Related benchmarks

Task                      Dataset                            Metric              Result  Rank
Named Entity Recognition  BC5CDR                             F1 Score            88.08   59
Named Entity Recognition  NCBI-disease                       F1 Score            87.49   29
Named Entity Recognition  AnatEM                             F1 Score            88.18   21
Named Entity Recognition  BC4CHEMD                           F1 Score            89.65   14
Named Entity Recognition  Linnaeus preprocessed (test)       Micro-F1 (excl. O)  88.27   4
Named Entity Recognition  NCBI-Disease preprocessed (test)   Micro-F1 (excl. O)  87.49   4
Named Entity Recognition  AnatEM preprocessed (test)         Micro-F1 (excl. O)  88.18   4
Named Entity Recognition  BC5CDR preprocessed (test)         Micro-F1 (excl. O)  88.08   4
Named Entity Recognition  BC4CHEMD preprocessed (test)       Micro-F1 (excl. O)  89.65   4
Named Entity Recognition  Species800 preprocessed (test)     Micro-F1 (excl. O)  76.35   4
Showing 10 of 15 rows
