Medical Triage as Pairwise Ranking: A Benchmark for Urgency in Patient Portal Messages

About

Medical triage is the task of allocating medical resources and prioritizing patients based on medical need. This paper introduces the first large-scale public dataset for studying medical triage in the context of asynchronous outpatient portal messages. Our novel task formulation views patient message triage as a pairwise inference problem, where we train LLMs to choose "which message is more medically urgent" in a head-to-head, tournament-style re-sort of a physician's inbox. Our novel benchmark, PMR-Bench, contains 1,569 unique messages and 2,000+ high-quality test pairs for pairwise medical urgency assessment, alongside a scalable training data generation pipeline. PMR-Bench includes samples that contain both unstructured patient-written messages and real electronic health record (EHR) data, emulating a real-world medical triage scenario. We develop a novel automated data annotation strategy to provide LLMs with in-domain guidance on this task. The resulting data is used to train two model classes, UrgentReward and UrgentSFT, which leverage Bradley-Terry and next-token prediction objectives, respectively, to perform pairwise urgency classification. We find that UrgentSFT achieves top performance on PMR-Bench, with UrgentReward showing distinct advantages in low-resource settings. For example, UrgentSFT-8B and UrgentReward-8B provide a 15- and 16-point boost, respectively, on inbox sorting metrics over off-the-shelf 8B models. Paper resources can be found at https://tinyurl.com/Patient-Message-Triage

Joseph Gatto, Parker Seegmiller, Timothy Burdick, Philip Resnik, Roshnik Rahat, Sarah DeLozier, Sarah M. Preum • 2026
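
The pairwise formulation in the abstract can be illustrated with a small sketch. It shows the standard Bradley-Terry preference probability and its negative log-likelihood, followed by one possible way to re-sort an inbox from head-to-head comparisons (round-robin win counting). The function names, the keyword-based toy scorer, and the round-robin scheme are assumptions made for illustration; the abstract does not specify the authors' tournament procedure or model interfaces.

```python
import math
from itertools import combinations


def bt_prob(score_a: float, score_b: float) -> float:
    """Bradley-Terry probability that A is preferred over B: sigmoid(s_a - s_b)."""
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))


def bt_loss(score_more_urgent: float, score_less_urgent: float) -> float:
    """Negative log-likelihood of the labeled preference under Bradley-Terry."""
    return -math.log(bt_prob(score_more_urgent, score_less_urgent))


def round_robin_sort(messages, more_urgent):
    """Order messages by number of head-to-head wins, most urgent first.

    more_urgent(a, b) returns True when a is judged more urgent than b.
    Round-robin win counting is an assumed scheme, not the paper's.
    """
    wins = [0] * len(messages)
    for i, j in combinations(range(len(messages)), 2):
        winner = i if more_urgent(messages[i], messages[j]) else j
        wins[winner] += 1
    order = sorted(range(len(messages)), key=lambda i: -wins[i])
    return [messages[i] for i in order]


# Hypothetical stand-in for a trained comparator: keyword scores invented
# for this example only.
TOY_SCORES = {"chest pain": 3.0, "rash": 1.0, "refill": 0.0}


def toy_score(message: str) -> float:
    return max((v for k, v in TOY_SCORES.items() if k in message), default=0.0)


def toy_more_urgent(a: str, b: str) -> bool:
    return bt_prob(toy_score(a), toy_score(b)) > 0.5


inbox = [
    "Need a refill of my allergy medication",
    "I have chest pain when climbing stairs",
    "Mild rash on my arm since Tuesday",
]
ranked = round_robin_sort(inbox, toy_more_urgent)
```

In this toy run the chest-pain message wins both of its comparisons and lands first. A real system would replace `toy_more_urgent` with a call to a trained pairwise model, and a full round robin needs a number of comparisons quadratic in inbox size, so a cheaper comparison schedule may be preferable for large inboxes.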

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Inbox Sorting | PMR-Reddit (test) | T-NDCG@10 | 0.77 | 16 |
| Inbox Sorting | PMR-Synth (test) | T-NDCG@10 | 0.73 | 16 |
| Inbox Sorting | PMR-Real (test) | T-NDCG@10 | 77 | 14 |
| Pairwise classification | PMR-Reddit Easy | Accuracy | 98 | 14 |
| Pairwise classification | PMR-Reddit (Med) | Accuracy | 86 | 14 |
| Pairwise classification | PMR-Reddit (Hard) | Accuracy | 87 | 14 |
| Pairwise classification | PMR-Reddit (Total) | Accuracy | 88 | 14 |
| Pairwise classification | PMR-Synth Med | Accuracy | 77 | 14 |
| Pairwise classification | PMR-Synth (Total) | Accuracy | 73 | 14 |
| Pairwise classification | PMR-Synth Easy | Accuracy | 93 | 14 |
Showing 10 of 15 rows
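
For readers unfamiliar with the inbox sorting metric family, the sketch below computes standard NDCG@k with linear gain over a predicted ordering. This page does not define T-NDCG, so the code illustrates plain NDCG@k only and should not be read as the paper's metric; the graded urgency labels in the example are invented.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG of the given order divided by DCG of the ideal (sorted) order."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical urgency labels, listed in the order a model ranked the messages.
predicted_order_urgency = [3, 1, 2, 0]
score = ndcg_at_k(predicted_order_urgency, k=10)
```

A perfectly sorted inbox scores 1.0, and swapping a more urgent message below a less urgent one lowers the score, with mistakes near the top of the list penalized most.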
