AnyPPG: An ECG-Guided PPG Foundation Model Trained on Over 100,000 Hours of Recordings for Holistic Health Profiling
About
Photoplethysmography (PPG) is widely used as a non-invasive and accessible modality for continuous health monitoring. However, despite being a peripheral hemodynamic signal intrinsically coupled with systemic circulation, existing research has largely confined its scope to a narrow range of cardiovascular tasks, leaving a fundamental question underexplored: to what extent can PPG support holistic health profiling beyond traditional cardiovascular applications? To answer this question, we present AnyPPG, a foundation model-based framework designed to reveal the broader health-profiling potential of PPG. To ensure reliable performance for this investigation, AnyPPG is pretrained with ECG guidance on the most diverse PPG corpus with synchronized ECG to date, comprising over 100,000 hours of recordings from six large-scale data sources. This pretraining yields robust and physiologically grounded PPG representations that provide a reliable basis for subsequent analysis. Building upon this pretrained model, we conduct a systematic investigation into the association between PPG and holistic health through, to our knowledge, the first PPG-based phenome-wide disease detection study, spanning 1,468 disease phenotypes in more than 15,000 subjects. Our evaluation demonstrates the effectiveness of AnyPPG: across eight clinical and wearable datasets covering 15 downstream tasks, it achieves the best performance in 13 tasks. More importantly, in the phenome-wide analysis, AnyPPG exhibits meaningful discriminative capability (AUC $\ge$ 0.70) for 307 phenotypes across 16 distinct phecode chapters, including 230 non-circulatory conditions such as dementia and chronic kidney disease, many of which have rarely been explored using PPG. Collectively, these findings indicate that easily acquired PPG signals encode rich health-related information extending well beyond conventional cardiovascular assessment.
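The phenome-wide criterion above (AUC $\ge$ 0.70 per phenotype, counted across phecode chapters) can be illustrated with a small sketch. This is not the authors' evaluation code. The function names, the `toy` dictionary, and the phenotype and chapter labels are hypothetical placeholders, and the scores are made-up numbers chosen only to show the thresholding step. AUC is computed here with the standard rank-based (Mann-Whitney) formula, which is an assumption about how such a metric might be obtained.

```python
from collections import defaultdict


def auc(labels, scores):
    """Rank-based AUC (Mann-Whitney U); tied scores share their average rank."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to the end of the current group of tied scores.
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")
    pos_rank_sum = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def detectable_phenotypes(results, threshold=0.70):
    """Group phenotypes whose AUC meets the threshold by chapter.

    results: {phenotype: (chapter, binary_labels, model_scores)}
    """
    by_chapter = defaultdict(list)
    for name, (chapter, labels, scores) in results.items():
        if auc(labels, scores) >= threshold:
            by_chapter[chapter].append(name)
    return dict(by_chapter)


# Hypothetical toy inputs (not data from the paper).
toy = {
    "phenotype_A": ("chapter_1", [0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    "phenotype_B": ("chapter_2", [0, 1, 0, 1], [0.5, 0.5, 0.5, 0.5]),
}
print(detectable_phenotypes(toy))
```

In this toy case only `phenotype_A` clears the threshold, because `phenotype_B` has constant scores and therefore chance-level AUC. Applied to real per-phenotype predictions, the same grouping would yield the kind of per-chapter counts the abstract reports.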
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Classification | PPG Classification Benchmark Suite | Stress Accuracy | 98.15 | 14 |
| Activity Classification | Activity Classification Dataset | AUC | 85.57 | 7 |
| Affect Classification | Affect Classification Dataset | AUC | 84.07 | 7 |
| Systolic BP Regression | Systolic Blood Pressure Regression Dataset | MAE | 13.09 | 7 |
| Average Heart Rate Regression | Average Heart Rate | MAE | 4.135 | 7 |
| Stress Classification | Stress Classification Dataset | AUC | 98.15 | 7 |
| Signal Quality Classification | Signal Quality Classification Dataset | AUC | 95.23 | 7 |
| Diastolic BP Regression | Diastolic Blood Pressure Regression Dataset | MAE | 9.211 | 7 |
| Heart Rate Regression | Heart Rate Regression Dataset | MAE | 2.773 | 7 |
| Human Identification | Human Identification Dataset | AUC | 98.95 | 7 |