
PriMock57: A Dataset Of Primary Care Mock Consultations

About

Recent advances in Automatic Speech Recognition (ASR) have made it possible to reliably produce automatic transcripts of clinician-patient conversations. However, access to clinical datasets is heavily restricted due to patient privacy, thus slowing down normal research practices. We detail the development of a public-access, high-quality dataset comprising 57 mocked primary care consultations, including audio recordings, their manual utterance-level transcriptions, and the associated consultation notes. Our work illustrates how the dataset can be used as a benchmark for conversational medical ASR as well as consultation note generation from transcripts.

Alex Papadopoulos Korfiatis, Francesco Moramarco, Radmila Sarac, Aleksandar Savkov • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Transcript Alignment | Common Voice English 8 (test) | Character GLE: 65.8 | 16 |
| Transcript Alignment | PriMock57 (PM57) 1 (test) | Character GLE: 76.7 | 16 |
| Transcript Alignment | TED-LIUM v3 (test) | Character GLE: 78.1 | 16 |
| Speech Alignment | Common Voice Portuguese | Character GLE: 59.2 | 3 |
| Speech Alignment | Common Voice Spanish | Character GLE (%): 60.9 | 3 |
| Speech Alignment | Common Voice Turkish | Character GLE: 40.4 | 3 |
| Speech Alignment | Common Voice German | Character GLE (%): 47 | 3 |
| Speech Alignment | Common Voice Polish | Character GLE: 54 | 3 |
| Speech Alignment | Common Voice Indonesian | Character GLE: 56.5 | 3 |
| Speech Alignment | Common Voice Swahili | Character GLE: 45.3 | 3 |
Showing 10 of 11 rows
