RuBioRoBERTa: a pre-trained biomedical language model for Russian language biomedical text mining

About

This paper presents several BERT-based models for Russian-language biomedical text mining (RuBioBERT, RuBioRoBERTa). The models are pre-trained on a corpus of freely available texts in the Russian biomedical domain. With this pre-training, our models demonstrate state-of-the-art results on RuMedBench, a Russian medical language understanding benchmark that covers a diverse set of tasks, including text classification, question answering, natural language inference, and named entity recognition.

Alexander Yalunin, Alexander Nesterov, Dmitriy Umerenkov • 2022

Related benchmarks

Task	Dataset	Result	Rank
Cancer risk prediction	EHR-based cancer screening dataset 2016-2023 (test)	Average Precision: 9.3	18

