
MedBioRAG: Semantic Search and Retrieval-Augmented Generation with Large Language Models for Medical and Biological QA

About

Recent advancements in retrieval-augmented generation (RAG) have significantly enhanced the ability of large language models (LLMs) to perform complex question-answering (QA) tasks. In this paper, we introduce MedBioRAG, a retrieval-augmented model designed to improve biomedical QA performance through a combination of semantic and lexical search, document retrieval, and supervised fine-tuning. MedBioRAG efficiently retrieves and ranks relevant biomedical documents, enabling precise and context-aware response generation. We evaluate MedBioRAG across text retrieval, close-ended QA, and long-form QA tasks using benchmark datasets such as NFCorpus, TREC-COVID, MedQA, PubMedQA, and BioASQ. Experimental results demonstrate that MedBioRAG outperforms previous state-of-the-art (SoTA) models and the GPT-4o base model in all evaluated tasks. Notably, our approach improves NDCG and MRR scores for document retrieval, while achieving higher accuracy in close-ended QA and ROUGE scores in long-form QA. Our findings highlight the effectiveness of semantic search-based retrieval and LLM fine-tuning in biomedical applications.
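The abstract reports retrieval quality as NDCG and MRR. For readers unfamiliar with these metrics, the following is a minimal sketch of how they are commonly computed; it is not the paper's evaluation code. The function names, the linear-gain form of DCG, and the choice to compute the ideal ranking from the retrieved list alone (rather than from all judged documents) are assumptions made for illustration.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k results (linear gain; an assumption)."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """DCG normalised by the DCG of the same relevances sorted best-first.

    Assumption: the ideal ranking is built only from the retrieved list,
    which simplifies the usual definition based on all judged documents.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def mean_reciprocal_rank(ranked_lists: Sequence[Sequence[float]]) -> float:
    """Average of 1/rank of the first relevant result per query (0 if none)."""
    if not ranked_lists:
        return 0.0
    total = 0.0
    for relevances in ranked_lists:
        for rank, rel in enumerate(relevances, start=1):
            if rel > 0:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)


# Hypothetical relevance labels for two queries, listed in ranked order.
example_queries = [[0, 1, 0], [1, 0, 0]]
print(mean_reciprocal_rank(example_queries))
print(ndcg_at_k([0, 1, 0], k=3))
```

Both metrics reward placing relevant documents near the top of the ranking: MRR looks only at the first relevant hit, while NDCG accounts for every relevant document and its graded relevance.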

Seonok Kim • 2025

Related benchmarks

Task                Dataset               Metric    Result  Rank
Question Answering  BioASQ                Accuracy  98.32   57
Medical Knowledge   MedQA                 Accuracy  89.47   20
Close-ended QA      PubMedQA              Accuracy  85      10
Long-form QA        LiveQA (test)         ROUGE-1   27.33   4
Long-form QA        MedicationQA (test)   ROUGE-1   27.73   4
Long-form QA        PubMedQA (test)       ROUGE-1   37.49   4
Long-form QA        BioASQ (test)         ROUGE-1   34.3    4
