A digital biomarker of diabetes from smartphone-based vascular signals

Robert Avram, Jeffrey E Olgin, Peter Kuhar, J Weston Hughes, Gregory M Marcus, Mark J Pletcher, Kirstin Aschbacher, Geoffrey H Tison

Abstract

The global burden of diabetes is rapidly increasing, from 451 million people in 2019 to 693 million by 2045¹. The insidious onset of type 2 diabetes delays diagnosis and increases morbidity². Given the multifactorial vascular effects of diabetes, we hypothesized that smartphone-based photoplethysmography could provide a widely accessible digital biomarker for diabetes. Here we developed a deep neural network (DNN) to detect prevalent diabetes using smartphone-based photoplethysmography from an initial cohort of 53,870 individuals (the 'primary cohort'), which we then validated in a separate cohort of 7,806 individuals (the 'contemporary cohort') and a cohort of 181 prospectively enrolled individuals from three clinics (the 'clinic cohort'). The DNN achieved an area under the curve for prevalent diabetes of 0.766 in the primary cohort (95% confidence interval: 0.750-0.782; sensitivity 75%, specificity 65%) and 0.740 in the contemporary cohort (95% confidence interval: 0.723-0.758; sensitivity 81%, specificity 54%). When the output of the DNN, called the DNN score, was included in a regression analysis alongside age, gender, race/ethnicity and body mass index, the area under the curve was 0.830 and the DNN score remained independently predictive of diabetes. The performance of the DNN in the clinic cohort was similar to that in other validation datasets. There was a significant and positive association between the continuous DNN score and hemoglobin A1c (P ≤ 0.001) among those with hemoglobin A1c data. These findings demonstrate that smartphone-based photoplethysmography provides a readily attainable, non-invasive digital biomarker of prevalent diabetes.

Conflict of interest statement

Competing Interests: Dr. Olgin has received research funding from Samsung and iBeat. Dr. Marcus has received research funding from Medtronic, Jawbone, and Eight. Dr. Aschbacher received funding from Jawbone Health Hub. Peter Kuhar is an employee of Azumio. Dr. Tison has received research grants from Janssen Pharmaceuticals and Myokardia and is an advisor to Cardiogram, Inc. None of the remaining authors have potential conflicts of interest. Azumio provided no financial support for this study and only provided access to the data. Data analysis, interpretation and decision to submit the manuscript were performed independently from Azumio.

Figures

Extended Data Fig. 1. Baseline characteristics of the Primary Cohort by Diabetes Status
The Primary Cohort sample size was 53,870 individuals. Where data was only available for subgroups of the full cohort, subgroup sample size is denoted by N. Differences in means of continuous variables between 2 groups were compared using the two-sample t-test. Differences in proportions of categorical variables between 2 groups were compared using the Chi-Squared test. Tests of significance were 2 sided. Abbreviations: bpm: beats per minute; CAD: Coronary artery disease; CHF: Congestive heart failure; COPD: Chronic obstructive pulmonary disease; HR: Heart rate; MI: Myocardial Infarction; PVD: Peripheral Vascular Disease.
Extended Data Fig. 2. Baseline Characteristics in the Primary Cohort Training, Development and Test Datasets
The Primary Cohort sample size was 53,870 individuals. Where data was only available for subgroups of the full cohort, subgroup sample size is denoted by N. Differences in means of continuous variables between 2 groups were compared using the two-sample t-test. Differences in means of continuous variables between 3+ groups were compared using one-way ANOVA. Differences in proportions of categorical variables between 2+ groups were compared using the Chi-Squared test. Tests of significance were 2 sided. a, b, c: Each subscript letter denotes a subset of dataset categories whose column proportions do not differ significantly from each other at the 0.05 level. Post-hoc analysis was performed using Fisher’s least significant differences to compare means of continuous variables between groups. Abbreviations: SD: Standard deviation; CAD: Coronary artery disease; CHF: Congestive heart failure; COPD: Chronic obstructive pulmonary disease; HR: Heart rate; MI: Myocardial Infarction; PVD: Peripheral Vascular Disease.
Extended Data Fig. 3. Baseline Characteristics of the Primary, Contemporary and Clinic Cohorts
Where data was only available for subgroups of the full cohorts, subgroup sample size is denoted by N. Differences in means of continuous variables between 2 groups were compared using the two-sample t-test. Differences in means of continuous variables between 3+ groups were compared using one-way ANOVA. Differences in proportions of categorical variables between 2+ groups were compared using the Chi-Squared test. Tests of significance were 2 sided. a, b, c: Each subscript letter denotes a subset of dataset categories whose column proportions do not differ significantly from each other at the 0.05 level. Post-hoc analysis was performed using Fisher’s least significant differences to compare means of continuous variables between groups. Abbreviations: SD: Standard deviation; CAD: Coronary artery disease; CHF: Congestive heart failure; COPD: Chronic obstructive pulmonary disease; HR: Heart rate; MI: Myocardial Infarction; PVD: Peripheral Vascular Disease.
Extended Data Fig. 4. Confusion matrices for DNN performance in 3 validation datasets.
Confusion matrices for the predictions of the DNN in the Test Dataset (a-b), Contemporary Cohort (c-d), and Clinic Cohort (e-f), at both the recording level and the user level. The total number of patients is presented in parentheses. The DNN Score cutoff used was 0.427.
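As a minimal sketch of how a confusion matrix follows from a score cutoff, the snippet below counts true and false positives and negatives at the reported cutoff of 0.427 and derives sensitivity and specificity from them. The function names and the toy scores and labels are hypothetical, and treating a score exactly equal to the cutoff as positive is an assumption; the caption does not specify it.

```python
from typing import Dict, Sequence

DNN_SCORE_CUTOFF = 0.427  # cutoff stated in the caption above


def confusion_counts(
    scores: Sequence[float],
    has_diabetes: Sequence[bool],
    cutoff: float = DNN_SCORE_CUTOFF,
) -> Dict[str, int]:
    """Count true/false positives and negatives at a given score cutoff."""
    if len(scores) != len(has_diabetes):
        raise ValueError("scores and labels must have the same length")
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for score, label in zip(scores, has_diabetes):
        # Assumption: a score equal to the cutoff is classified as positive.
        predicted_positive = score >= cutoff
        if predicted_positive and label:
            counts["tp"] += 1
        elif predicted_positive:
            counts["fp"] += 1
        elif label:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def sensitivity(counts: Dict[str, int]) -> float:
    """Fraction of individuals with diabetes who are flagged positive."""
    return counts["tp"] / (counts["tp"] + counts["fn"])


def specificity(counts: Dict[str, int]) -> float:
    """Fraction of individuals without diabetes who are flagged negative."""
    return counts["tn"] / (counts["tn"] + counts["fp"])


# Hypothetical toy data, not taken from the study.
toy_scores = [0.90, 0.50, 0.30, 0.10, 0.45, 0.20]
toy_labels = [True, True, True, False, False, False]
toy_counts = confusion_counts(toy_scores, toy_labels)
```

The same counting applies at either level of analysis; only the unit being scored (a single recording or a user) changes.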
Extended Data Fig. 5. DNN performance to predict diabetes according to time of day, recording length and heart rate in the Test dataset.
DNN sensitivity, specificity, diagnostic odds-ratio and AUC to detect prevalent diabetes are presented across strata of age, gender and number of recordings. The Test Dataset sample size is 11,313 individuals. Counts are provided in parentheses for all subgroup metrics. The diagnostic odds-ratio is the ratio of positive likelihood ratio (sensitivity / (1–specificity)) to the negative likelihood ratio ((1–sensitivity)/specificity). The diagnostic odds-ratio is presented at the recording-level with the associated 95% confidence interval. Interaction p-values are two-sided Wald tests for interaction between the DNN Score and the respective covariates for diabetes. Abbreviations: DNN: deep neural network; OR: diagnostic odds ratio; AUC: area under the curve; CI: confidence interval; BPM: beats per minute.
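The diagnostic odds-ratio definition above can be written out directly. The sketch below is illustrative only: the function name is hypothetical, and the example inputs are the overall primary-cohort sensitivity and specificity quoted in the abstract (75% and 65%), so the resulting number is not a value reported by the source.

```python
def diagnostic_odds_ratio(sensitivity: float, specificity: float) -> float:
    """Ratio of the positive likelihood ratio to the negative likelihood ratio."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        # At exactly 0 or 1 one of the likelihood ratios is zero or undefined.
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    positive_lr = sensitivity / (1.0 - specificity)
    negative_lr = (1.0 - sensitivity) / specificity
    return positive_lr / negative_lr


# Illustrative inputs only: sensitivity 75%, specificity 65%.
example_dor = diagnostic_odds_ratio(0.75, 0.65)
```

Algebraically the ratio simplifies to (sensitivity × specificity) / ((1 − sensitivity) × (1 − specificity)), which is why it grows quickly as either metric approaches 1.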
Extended Data Fig. 6. Activation maps from several hidden convolutional layers of the trained Deep Neural Network (DNN) for one photoplethysmography (PPG) record.
a. An example of a PPG recording which serves as the input into the DNN. b. The activation map of one example filter (out of 16) from the first convolutional layer of the neural network. This activation map is obtained after the example PPG recording is fed into the trained DNN. Each lighter colored band illustrates “activation” of a model parameter. At this early layer of the neural network, the lighter colored bands correspond directly to each cardiac cycle of the PPG waveform. Thicker lines likely indicate morphological features of the waveform. c. Visualization of the activation maps of the 16 filters from the first convolutional layer of the neural network, obtained after the input PPG is fed into the trained DNN. Each of the 16 filters can learn different sets of “features” from the input PPG recording. Filters with more purple bands have more inactive neurons, as compared to those with lighter colors (green being the strongest activation and dark purple being the weakest activation). Six filters appear completely inactivated (all purple), suggesting that the features these filters focus on are not present in this example input PPG. d. Visualization of the activation maps of the 7th convolutional layer of the DNN, comprised of 32 filters. Broadly, these activation maps from the 7th layer of the DNN are more complex compared to those from the 1st layer (b-c), demonstrating how deeper layers of the DNN encode increasingly abstract information representing higher level interactions and complex features.
Extended Data Fig. 7. Activation maps from hidden convolutional layers of the trained Deep Neural Network (DNN) for an example photoplethysmography (PPG) recording with artifacts.
a. An example PPG recording with 2 artifacts (blue and orange rectangles) which serves as the input into the DNN. b. Activation maps of the 16 filters from the first convolutional layer of the DNN. Each lighter colored band illustrates “activation” of a model parameter. Orange and blue arrows are placed on filters to denote the locations of the artifacts highlighted by the orange and blue rectangles in (a), respectively. Some filters, such as the 4th image in the top row, seem to have no activation at the location of the artifactual beats (hollow orange and blue arrows), suggesting that the DNN is “ignoring” data from these artifact locations. Other filters, such as the 2nd filter from the left in the top row, do show activation, indicated by lighter colored bars, at the locations of the artifacts (full orange and blue arrows), suggesting that the DNN is using data from these artifact locations. Some filters, such as the 2nd from the left in the bottom row, “ignore” the artifactual beats by having uniform activation throughout the signal length (except where there are artifacts), likely representing the cardiac cycle. These findings suggest that the DNN is able to identify artifactual beats and differentiate them from good quality waveforms.
Extended Data Fig. 8. Example photoplethysmography (PPG) waveforms.
a. Examples of raw PPG recordings from individuals with and without diabetes (red/green recordings, respectively), which serve as inputs to the deep neural network. DNN Scores predicted for each recording are shown. PPG recordings are either cropped or zero-padded to the same fixed length (~20.3 seconds) before being input into the DNN. The “flat line” in three examples is a demonstration of zero-padding shorter records to the fixed length. DNN: Deep Neural Network; ms: milliseconds.
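The crop-or-zero-pad step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, `target_length` is a sample count chosen by the caller (the number of samples corresponding to ~20.3 seconds depends on a sampling rate not given in this caption), and cropping from the start and padding at the end are assumptions rather than details confirmed by the source.

```python
from typing import List, Sequence


def crop_or_zero_pad(signal: Sequence[float], target_length: int) -> List[float]:
    """Return a copy of `signal` with exactly `target_length` samples."""
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    if len(signal) >= target_length:
        # Assumption: keep the first `target_length` samples when cropping.
        return list(signal[:target_length])
    # Assumption: append zeros at the end, which produces a trailing flat line.
    padding = [0.0] * (target_length - len(signal))
    return list(signal) + padding


# Hypothetical toy signals, far shorter than a real PPG recording.
padded = crop_or_zero_pad([1.0, 2.0, 3.0], 5)
cropped = crop_or_zero_pad([1.0, 2.0, 3.0], 2)
```

A fixed input length lets every recording be fed to the same network regardless of how long the user held the measurement.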
Extended Data Fig. 9. Deep Neural Network architecture.
The neural network had 39 layers organized in a block structure, consisting of convolutional layers with an initial filter size of 15 and filter number (N) of 16. The size of the filters decreased, and the number of filters increased as network depth increased, as shown. After each convolutional layer, we applied batch normalization, rectified linear activation and dropout with a probability of 0.2. The final flattened and fully connected softmax layer produced an output distribution across the classes of diabetes/no diabetes. This output distribution is referred to as the DNN Score. PPG: photoplethysmography; DNN: Deep Neural Network; Hz: Hertz.
Figure 1. CONSORT diagram describing the study cohorts and screenshots from the smartphone app used for PPG acquisition
a. Description of the datasets used for algorithm development and validation. The deep neural network (DNN) was trained using the training and development datasets of the Primary Cohort (left), and validated using the test dataset of the Primary Cohort. We additionally validated the DNN in the temporally distinct Contemporary Cohort (middle) and the prospectively enrolled, in-person Clinic Cohort (right). Blue outlines indicate datasets used for model development and training. Yellow outlines indicate datasets used for model validation. All datasets are completely separate and do not contain overlapping participants. b. Screenshots from the smartphone app used to acquire user-measured PPG recordings with the smartphone camera. PPG: photoplethysmography; BPM: beats per minute; DNN: Deep Neural Network; Hz: Hertz.
Figure 2. Comparison of model performance to detect diabetes in the Test Dataset.
a. Receiver operating characteristic curves for detection of diabetes, as assessed for the DNN score alone or for the output of LogReg Model 5, which includes comorbidities, with and without the DNN Score. This is calculated at either the recording-level, which treats each recording independently, or at the user-level, which is averaged across all recordings of an individual user. The DNN Score cutoff used (0.427) is indicated by a black dot on each curve. Inset: Bar chart showing the area under the receiver operating characteristic curve (AUC) point estimate values for diabetes in the test dataset by the indicated models; 95% confidence intervals are shown as error bars. b. DNN sensitivity, specificity, diagnostic odds-ratio and AUC to detect prevalent diabetes in the Test Dataset, as reported across ranges of age, gender and number of recordings. The Test Dataset sample size is 11,313 individuals. Counts are provided in parentheses for all subgroup metrics. The diagnostic odds-ratio was quantified as the ratio of positive likelihood ratio (sensitivity / (1–specificity)) to the negative likelihood ratio ((1–sensitivity)/specificity), with the associated 95% CI. The diagnostic odds-ratio is presented at the user-level for strata of age, gender and number of recordings. Interaction p-values are two-sided Wald tests between the DNN Score and the respective covariates for diabetes. Abbreviations: DNN: deep neural network; AUC: area under the receiver operating characteristic curve; OR: diagnostic odds ratio; CI: confidence interval.
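The difference between recording-level and user-level evaluation can be illustrated with a short sketch. It assumes that "averaged" means the arithmetic mean of a user's recording-level DNN Scores and that the same 0.427 cutoff is then applied to the averaged score; the function names, user identifiers and scores are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

DNN_SCORE_CUTOFF = 0.427  # cutoff stated in the caption above


def user_level_scores(recordings: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Average recording-level scores per user (assumption: arithmetic mean)."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for user_id, score in recordings:
        totals[user_id] += score
        counts[user_id] += 1
    return {user_id: totals[user_id] / counts[user_id] for user_id in totals}


def classify_users(
    scores_by_user: Dict[str, float], cutoff: float = DNN_SCORE_CUTOFF
) -> Dict[str, bool]:
    """Flag each user whose averaged score reaches the cutoff."""
    return {user_id: score >= cutoff for user_id, score in scores_by_user.items()}


# Hypothetical recordings as (user_id, recording-level DNN Score) pairs.
toy_recordings = [("u1", 0.25), ("u1", 0.75), ("u2", 0.125), ("u2", 0.375)]
toy_user_scores = user_level_scores(toy_recordings)
toy_user_flags = classify_users(toy_user_scores)
```

In this toy example one of user `u1`'s two recordings falls below the cutoff on its own, yet the averaged score classifies the user as positive, which shows how the two levels of analysis can disagree.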

Source: PubMed
