Biomedical named entity recognition using BERT in the machine reading comprehension framework

About

Recognizing biomedical entities in the literature is a challenging research problem, and it is the foundation for converting the large amount of biomedical knowledge held in unstructured text into structured formats. Biomedical named entity recognition (BioNER) is conventionally implemented in the sequence labeling framework. This approach, however, often cannot take full advantage of the semantic information in the dataset, and its performance is not always satisfactory. In this work, instead of treating the BioNER task as a sequence labeling problem, we formulate it as a machine reading comprehension (MRC) problem. This formulation can introduce more prior knowledge through well-designed queries, and it no longer needs a decoding process such as conditional random fields (CRF). We conduct experiments on six BioNER datasets, and the experimental results demonstrate the effectiveness of our method. Our method achieves state-of-the-art (SOTA) performance on the BC4CHEMD, BC5CDR-Chem, BC5CDR-Disease, NCBI-Disease, BC2GM and JNLPBA datasets, with F1-scores of 92.92%, 94.19%, 87.83%, 90.04%, 85.48% and 78.93%, respectively.
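
To make the MRC formulation concrete, the sketch below shows one common way to turn a BIO-tagged sentence into query/context examples with start and end labels, one example per entity type. It is an illustration under stated assumptions, not the authors' implementation: the query wording in `QUERIES`, the helper names `bio_to_spans` and `to_mrc_examples`, and the binary start/end label encoding are all hypothetical choices that the abstract does not specify.

```python
from typing import Dict, List, Tuple

# Hypothetical queries, one per entity type. The abstract does not give the
# actual query wording, so these strings are placeholders for illustration.
QUERIES: Dict[str, str] = {
    "Chemical": "Find all chemical entities mentioned in the text.",
    "Disease": "Find all disease entities mentioned in the text.",
}


def bio_to_spans(labels: List[str]) -> List[Tuple[str, int, int]]:
    """Collect (entity_type, start, end) token spans from BIO labels.

    The end index is inclusive.
    """
    spans: List[Tuple[str, int, int]] = []
    start, etype = None, None
    # A trailing "O" sentinel closes an entity that runs to the last token.
    for i, lab in enumerate(labels + ["O"]):
        inside = lab.startswith("I-") and etype == lab[2:]
        if start is not None and not inside:
            spans.append((etype, start, i - 1))
            start, etype = None, None
        if lab.startswith("B-"):
            start, etype = i, lab[2:]
    return spans


def to_mrc_examples(tokens: List[str], labels: List[str]) -> List[dict]:
    """Build one MRC example per entity type (assumed encoding).

    Each example pairs a query with the sentence as context and marks the
    first and last token of every matching entity in binary label vectors.
    """
    spans = bio_to_spans(labels)
    examples = []
    for etype, query in QUERIES.items():
        start_labels = [0] * len(tokens)
        end_labels = [0] * len(tokens)
        for span_type, s, e in spans:
            if span_type == etype:
                start_labels[s] = 1
                end_labels[e] = 1
        examples.append({
            "entity_type": etype,
            "query": query,
            "context": tokens,
            "start_labels": start_labels,
            "end_labels": end_labels,
        })
    return examples


tokens = ["Aspirin", "can", "cause", "gastric", "ulcers", "."]
labels = ["B-Chemical", "O", "O", "B-Disease", "I-Disease", "O"]
examples = to_mrc_examples(tokens, labels)
for ex in examples:
    print(ex["entity_type"], ex["start_labels"], ex["end_labels"])
```

In a setup like this, the query carries the prior knowledge about the entity type, and entities are read off from predicted start and end positions, which is why no CRF-style sequence decoding is required.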

Cong Sun, Zhihao Yang, Lei Wang, Yin Zhang, Hongfei Lin, Jian Wang • 2020

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Phenotype Mining | BiolarkGSC+ | F1 Score: 0.498 | 30 |
| Phenotype Extraction | CSC (n=116 docs) | F1 Score: 38.2 | 21 |
| Rare disease mention extraction | RareDis n=1,011 docs (test) | Micro-F1: 71.7 | 19 |
| Rare disease mention extraction | MIMIC3 RD Code n=79 docs (test) | Micro-F1 Score: 4.6 | 19 |
| Rare disease mention extraction | MIMIC3-RD Entity n=117 docs (test) | Micro-F1: 4.6 | 19 |
| Phenotype Mining | CSC | Precision: 61.4 | 9 |
| Rare Disease Mining | RareDis | Precision: 73 | 7 |
| Rare Disease Mining | MIMIC RD Code 3 | Precision: 2.5 | 7 |
| Rare Disease Mining | MIMIC RD Entity 3 | Precision: 1.8 | 7 |