VerifAI: A Verifiable Open-Source Search Engine for Biomedical Question Answering

About

We introduce VerifAI, an open-source expert system for biomedical question answering that integrates retrieval-augmented generation (RAG) with a novel post-hoc claim verification mechanism. Unlike standard RAG systems, VerifAI ensures factual consistency by decomposing generated answers into atomic claims and validating them against retrieved evidence using a fine-tuned natural language inference (NLI) engine. The system comprises three modular components: (1) a hybrid Information Retrieval (IR) module optimized for biomedical queries (MAP@10 of 42.7%), (2) a citation-aware Generative Component fine-tuned on a custom dataset to produce referenced answers, and (3) a Verification Component that detects hallucinations with state-of-the-art accuracy, outperforming GPT-4 on the HealthVer benchmark. Evaluations demonstrate that VerifAI significantly reduces hallucinated citations compared to zero-shot baselines and provides a transparent, verifiable lineage for every claim. The full pipeline, including code, models, and datasets, is open-sourced to facilitate reliable AI deployment in high-stakes domains.
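The post-hoc verification step described above can be sketched roughly as follows. This is a minimal illustration, not the VerifAI implementation: the sentence-level splitter stands in for claim decomposition, `toy_nli` is a hypothetical word-overlap placeholder for the fine-tuned NLI engine, and the label names (`SUPPORT`, `CONTRADICT`, `NO_EVIDENCE`) and all function names are assumptions made for this example.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ClaimVerdict:
    claim: str
    label: str     # assumed label set: "SUPPORT", "CONTRADICT", "NO_EVIDENCE"
    evidence: str  # passage the label was computed against ("" if none)


def split_into_claims(answer: str) -> List[str]:
    """Naive stand-in for claim decomposition: one claim per sentence."""
    parts = re.split(r"(?<=[.!?])\s+", answer.strip())
    return [p for p in parts if p]


def _tokens(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def toy_nli(premise: str, hypothesis: str) -> str:
    """Hypothetical placeholder for a fine-tuned NLI model.

    Returns SUPPORT when at least 80% of the claim's words occur in the
    premise, otherwise NO_EVIDENCE. It never predicts CONTRADICT.
    """
    hyp, prem = _tokens(hypothesis), _tokens(premise)
    if not hyp:
        return "NO_EVIDENCE"
    overlap = len(hyp & prem) / len(hyp)
    return "SUPPORT" if overlap >= 0.8 else "NO_EVIDENCE"


def verify_answer(
    answer: str,
    evidence: List[str],
    nli: Callable[[str, str], str] = toy_nli,
) -> List[ClaimVerdict]:
    """Check every claim in the answer against each retrieved passage."""
    verdicts = []
    for claim in split_into_claims(answer):
        best = ClaimVerdict(claim, "NO_EVIDENCE", "")
        for passage in evidence:
            label = nli(passage, claim)
            if label == "SUPPORT":
                best = ClaimVerdict(claim, label, passage)
                break  # one supporting passage is enough in this sketch
            if label == "CONTRADICT" and best.label == "NO_EVIDENCE":
                best = ClaimVerdict(claim, label, passage)
        verdicts.append(best)
    return verdicts


if __name__ == "__main__":
    passages = ["Aspirin reduces the risk of heart attack in adults."]
    generated = "Aspirin reduces the risk of heart attack. Aspirin cures diabetes."
    for v in verify_answer(generated, passages):
        print(f"{v.label:12} {v.claim}")
```

In a real deployment the `nli` argument would be replaced by a call to a trained entailment classifier; the loop structure, which ties each claim to the passage that supports or contradicts it, is what gives every claim a traceable lineage.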

Miloš Košprdić, Adela Ljajić, Bojana Bašaragin, Darija Medvecki, Lorenzo Cassano, Nikola Milošević • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Question Answering | BioASQ | SAME_CONCLUSION Score: 85.71 | 10 |
| Scientific Claim Verification | SciFact (test) | Precision (NE): 88 | 4 |
| Question Answering | PQAref 908 samples (test) | Max References Per Answer Count: 3 | 3 |
| Question Answering | PQAref 10 samples (test) | Recall: 67 | 3 |
| Question Answering | PQAref 823 samples (test) | Missed Abstract Count: 10 | 3 |
| Claim Verification | BioASQ retrieval | Precision (NE): 89 | 2 |
| Claim Verification | ourIR retrieval | Precision (NE): 88 | 2 |
| Claim Verification | HealthVer (test) | -- | 2 |