CUICurate: A GraphRAG-based Framework for Automated Clinical Concept Curation for NLP applications

About

Background: Clinical named entity recognition tools commonly map free text to Unified Medical Language System (UMLS) Concept Unique Identifiers (CUIs). For many downstream tasks, however, the clinically meaningful unit is not a single CUI but a concept set comprising related synonyms, subtypes, and associated concepts. Constructing these sets is labour-intensive, inconsistently performed, and poorly supported by existing tools. Methods: We present CUICurate, a graph-based retrieval-augmented generation (GraphRAG) framework for automated UMLS concept set curation. A UMLS knowledge graph (KG) was constructed and embedded for semantic retrieval. Candidate CUIs were retrieved using graph-based expansion and then filtered and classified using large language models (GPT-5 and Qwen3-32B). The framework was evaluated on five lexically heterogeneous clinical concepts against manually curated concept sets and gold-standard concept sets. Results: CUICurate produced substantially larger and more complete concept sets than the manual benchmarks. A single retrieval configuration, applied across all concepts, achieved high recall of definitive concepts with manageable candidate sets. GPT-5 outperformed manual curation for all concepts and retained at least 95% of definitive gold-standard CUIs, while Qwen3-32B achieved comparable but slightly lower performance. Many of the missed concepts were not observed in 10,000 MIMIC-III notes. CUICurate infrastructure and end-to-end processing were inexpensive and stable across runs. Conclusions: CUICurate offers a scalable, reproducible and cost-efficient approach for generating clinician-reviewable UMLS concept sets tailored to clinical natural language processing and phenotyping applications.
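The graph-based expansion step and the recall measurement mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the adjacency-list graph, the placeholder identifiers (`CUI_A` etc., which are not real UMLS CUIs), the function names `expand_candidates` and `recall`, and the `max_hops` parameter are all assumptions made for illustration. Embedding-based semantic retrieval and LLM filtering are omitted.

```python
from collections import deque


def expand_candidates(graph, seeds, max_hops=1):
    """Breadth-first expansion from seed concepts, up to max_hops edges away."""
    seen = set(seeds)
    frontier = deque((seed, 0) for seed in seeds)
    while frontier:
        cui, depth = frontier.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(cui, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, depth + 1))
    return seen


def recall(candidates, gold):
    """Fraction of gold-standard concepts present in the candidate set."""
    if not gold:
        return 0.0
    return len(candidates & gold) / len(gold)


# Toy adjacency list with made-up placeholder identifiers (hypothetical data).
toy_graph = {
    "CUI_A": ["CUI_B", "CUI_C"],
    "CUI_B": ["CUI_D"],
    "CUI_C": [],
    "CUI_D": ["CUI_E"],
}
gold = {"CUI_A", "CUI_B", "CUI_D", "CUI_E"}

one_hop = expand_candidates(toy_graph, ["CUI_A"], max_hops=1)
two_hop = expand_candidates(toy_graph, ["CUI_A"], max_hops=2)
print(sorted(one_hop), recall(one_hop, gold))
print(sorted(two_hop), recall(two_hop, gold))
```

In this toy setting, widening the hop limit raises recall but also enlarges the candidate set that a later filtering stage would have to review, which is the trade-off behind the abstract's "high recall ... with manageable candidate sets".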

Victoria Blake, Jamie Novak, Mathew Miller, Sze-yuan Ooi, Blanca Gallego • 2026

Related benchmarks

Task	Dataset	Result
Graph Retrieval	UMLS manual concept sets M	Recall 98	5
Clinical Concept Classification	UMLS Five Target Concepts Definitive class (test)	Macro Recall 71	3
LLM Filtering	Manually adjudicated gold-standard CUIs Chronic Heart Failure v1 (test)	CUIs Count 137	3
LLM Filtering	Manually adjudicated gold-standard CUIs Fluid Overload v1 (test)	CUIs Count 77	3
LLM Filtering	Manually adjudicated gold-standard CUIs Ischaemic Stroke v1 (test)	Total CUIs 277	3
LLM Filtering	Manually adjudicated gold-standard CUIs LV Systolic Dysfunction v1 (test)	CUI Count 90	3
LLM Filtering	Manually adjudicated gold-standard CUIs Poor Mobility v1 (test)	CUIs Count 205	3
Clinical Concept Classification	UMLS Five Target Concepts All Classes (test)	Macro Recall 0.72	3
Graph Retrieval	Chronic heart failure concept set (val)	Manual CUI Count 98	1
Graph Retrieval	Fluid overload concept set (Manual val)	CUIs (Manual Count) 30	1

Showing 10 of 13 rows
