
Fully-automated sleep staging: multicenter validation of a generalizable deep neural network for Parkinson's disease and isolated REM sleep behavior disorder

About

Isolated REM sleep behavior disorder (iRBD) is a key prodromal marker of Parkinson's disease (PD), and video-polysomnography (vPSG) remains the diagnostic gold standard. However, manual sleep staging is particularly challenging in neurodegenerative diseases due to EEG abnormalities and fragmented sleep, which makes PSG assessment a bottleneck for deploying new RBD screening technologies at scale. We adapted U-Sleep, a deep neural network, for generalizable sleep staging in PD and iRBD. A U-Sleep model pretrained on a large, multisite non-neurodegenerative dataset (PUB; 19,236 PSGs across 12 sites) was fine-tuned on research datasets from two centers, the Lundbeck Foundation Parkinson's Disease Research Center (PACE) and the Cologne-Bonn Cohort (CBC) (112 PD, 138 iRBD, 89 age-matched controls). The resulting model was evaluated on an independent dataset from the Danish Center for Sleep Medicine (DCSM; 81 PD, 36 iRBD, 87 sleep-clinic controls). A subset of PSGs with low agreement between the human rater and the model (Cohen's $\kappa$ < 0.6) was re-scored by a second, blinded human rater to identify sources of disagreement. Finally, we applied confidence-based thresholds to optimize REM sleep staging. The pretrained model achieved a mean $\kappa$ of 0.81 in PUB, but only 0.66 when applied directly to PACE/CBC. Fine-tuning yielded a generalized model with $\kappa$ = 0.74 on PACE/CBC (p < 0.001 vs. the pretrained model). In DCSM, mean and median $\kappa$ increased from 0.60 to 0.64 (p < 0.001) and from 0.64 to 0.69 (p < 0.001), respectively. In the interrater study, PSGs with low agreement between the model and the initial scorer showed similarly low agreement between the two human scorers. Applying a confidence threshold increased the proportion of correctly identified REM sleep epochs from 85% to 95.5%, while preserving sufficient (> 5 min) REM sleep for 95% of subjects.
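The two quantities the abstract relies on, Cohen's $\kappa$ between two scorers and the effect of a confidence threshold on REM epochs, can be illustrated with a minimal sketch. This is not the authors' implementation: the stage encoding, the 30-second epoch length, the use of the model's per-epoch REM probability as the "confidence", the threshold value, the function names, and the toy data are all assumptions made for illustration.

```python
import numpy as np

# Assumed five-stage encoding and epoch length; the abstract does not specify them.
STAGES = ("W", "N1", "N2", "N3", "REM")
REM = STAGES.index("REM")
EPOCH_SEC = 30


def cohens_kappa(rater_a, rater_b, n_classes=len(STAGES)):
    """Cohen's kappa between two per-epoch label sequences of equal length."""
    a = np.asarray(rater_a)
    b = np.asarray(rater_b)
    confusion = np.zeros((n_classes, n_classes))
    np.add.at(confusion, (a, b), 1)  # count epochs for each (a, b) label pair
    total = confusion.sum()
    p_observed = np.trace(confusion) / total
    # Chance agreement: product of the two raters' marginal label frequencies.
    p_expected = (confusion.sum(axis=1) @ confusion.sum(axis=0)) / total**2
    return (p_observed - p_expected) / (1.0 - p_expected)


def confident_rem(probabilities, threshold):
    """Mask of epochs predicted as REM with REM probability >= threshold."""
    probs = np.asarray(probabilities)
    predicted = probs.argmax(axis=1)
    return (predicted == REM) & (probs[:, REM] >= threshold)


def rem_precision(mask, reference):
    """Fraction of selected epochs that the reference scorer also labelled REM."""
    reference = np.asarray(reference)
    if not mask.any():
        return float("nan")
    return float((reference[mask] == REM).mean())


# Toy data (hypothetical): ten epochs of human labels and model probabilities.
human = np.array([0, 1, 2, 2, 3, 4, 4, 4, 2, 0])
probs = np.array([
    [0.90, 0.05, 0.03, 0.01, 0.01],
    [0.20, 0.50, 0.20, 0.05, 0.05],
    [0.05, 0.10, 0.70, 0.10, 0.05],
    [0.00, 0.05, 0.35, 0.55, 0.05],
    [0.00, 0.00, 0.20, 0.75, 0.05],
    [0.02, 0.03, 0.03, 0.02, 0.90],
    [0.05, 0.10, 0.05, 0.00, 0.80],
    [0.10, 0.20, 0.10, 0.00, 0.60],
    [0.10, 0.10, 0.30, 0.00, 0.50],
    [0.80, 0.10, 0.05, 0.00, 0.05],
])
model = probs.argmax(axis=1)

kappa = cohens_kappa(human, model)
all_rem = confident_rem(probs, threshold=0.0)    # every epoch predicted as REM
sure_rem = confident_rem(probs, threshold=0.55)  # hypothetical threshold
precision_all = rem_precision(all_rem, human)
precision_sure = rem_precision(sure_rem, human)
retained_minutes = sure_rem.sum() * EPOCH_SEC / 60

print(f"kappa={kappa:.3f}")
print(f"REM precision: {precision_all:.2f} -> {precision_sure:.2f}")
print(f"REM retained after thresholding: {retained_minutes:.1f} min")
```

In this toy example, raising the threshold discards the one low-confidence epoch that the human scorer did not label REM, so precision rises while the amount of retained REM sleep falls. The abstract reports the same trade-off when it pairs the gain in correctly identified REM epochs with the requirement of more than 5 minutes of remaining REM sleep.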

Jesper Strøm, Casper Skjærbæk, Natasha Becker Bertelsen, Steffen Torpe Simonsen, Niels Okkels, David Bertram, Sinah Röttgen, Konstantin Kufer, Kaare B. Mikkelsen, Marit Otto, Poul Jørgen Jennum, Per Borghammer, Michael Sommerauer, Preben Kidmose • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Sleep Stage Classification | SHHS | F1 Macro: 80 | 23 |
| Sleep Stage Classification | Sleep-EDF ST | MF1: 83 | 11 |
| Sleep Staging | Physio 2018 | Macro F1: 81 | 4 |
| Sleep Stage Classification | ABC | Macro F1: 81 | 2 |
| Sleep Stage Classification | Chat | Macro F1: 86 | 2 |
| Sleep Stage Classification | HOMEPAP | Macro F1: 78 | 2 |
| Sleep Stage Classification | MESA | Macro F1: 81 | 2 |
| Sleep Stage Classification | SOF | Macro F1: 79 | 2 |
| Sleep Stage Classification | CCSHS | Macro F1: 0.84 | 2 |
| Sleep Stage Classification | CFS | Macro F1: 77 | 2 |
Showing 10 of 12 rows
