Pharmacologically informed machine learning approach for identifying pathological states of unconsciousness via resting-state fMRI

Justin M Campbell, Zirui Huang, Jun Zhang, Xuehai Wu, Pengmin Qin, Georg Northoff, George A Mashour, Anthony G Hudetz

Abstract

Determining the level of consciousness in patients with disorders of consciousness (DOC) remains challenging. To address this challenge, resting-state fMRI (rs-fMRI) has been widely used to detect local, regional, and network activity differences between DOC patients and healthy controls. Although substantial progress has been made towards this endeavor, robust rs-fMRI-based biomarkers for the level of consciousness are still lacking. Recent developments in machine learning show promise as a tool to augment the discrimination between different states of consciousness in clinical practice. Here, we investigated whether machine learning models trained to make a binary distinction between conscious wakefulness and anesthetic-induced unconsciousness could then reliably identify pathologically induced unconsciousness. We did so by extracting rs-fMRI-based features associated with local activity, regional homogeneity, and interregional functional connectivity in 44 subjects during wakefulness, light sedation, and unresponsiveness (deep sedation and general anesthesia), and subsequently using those features to train three distinct candidate machine learning classifiers: a support vector machine, Extra Trees, and an artificial neural network. First, we show that all three classifiers achieve reliable within-dataset performance (via nested cross-validation), with mean areas under the receiver operating characteristic curve (AUC) of 0.95, 0.92, and 0.94, respectively. Additionally, we observed comparable cross-dataset performance (making predictions on the DOC data): the anesthesia-trained classifiers consistently discriminated between patients with unresponsive wakefulness syndrome/vegetative state (UWS/VS) and healthy controls, with mean AUCs of 0.99, 0.94, and 0.98, respectively.
Lastly, we explored the potential of applying these classifiers to discriminating intermediate states of consciousness, specifically subjects under light anesthetic sedation and patients diagnosed with a minimally conscious state (MCS). Our findings demonstrate that machine learning classifiers trained on rs-fMRI features derived from participants under anesthesia have the potential to aid the discrimination between degrees of pathological unconsciousness in clinical patients.

Keywords: Anesthesia; Consciousness; Deep learning; Disorders of consciousness; Functional connectivity; Machine learning; Resting-state; fMRI.

Conflict of interest statement

All authors declare no conflict of interest.

Copyright © 2019 Elsevier Inc. All rights reserved.

Figures

Fig 1.
Summary of the different behavioral responsiveness assessments used across the three included datasets. (a) The Ramsay scale (shown here as 1/Ramsay score to facilitate comparison) was applied in the Anesthesia-SHH dataset, (b) the Observer’s Assessment of Alertness/Sedation (OAAS) scale was applied in the Anesthesia-WI dataset, and (c) the Coma Recovery Scale-Revised (CRS-R) was applied in the DOC dataset.
Fig 2.
Extraction of model features using fMRI-based measures of resting-state activity. (a) Node template representing the anatomical location of 226 seed regions of interest (ROIs) consolidated into 10 networks (Power et al., 2011): subcortical (Sub), ventral attention (VA), frontoparietal task control (FPTC), salience (Sal), auditory (Audi), dorsal attention (DA), default mode (DMN), cingulo-opercular task control (COTC), sensory/somatomotor (SS), visual (Visual). (b) Raw functional connectivity map (left) generated from seed-based pairwise Pearson correlations between 226 ROIs. Activity was averaged according to the network template, yielding measures of between-network (off-diagonal) and within-network (on-diagonal) functional connectivity (middle). Two additional measures of functional segregation, the amplitude of low-frequency fluctuations (ALFF) and regional homogeneity (ReHo), were calculated independently using the network templates.
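The network-averaging step described in the caption can be sketched as follows. This is a minimal illustration assuming NumPy; the function name, toy dimensions, and network labels are ours, not the study's (the paper uses 226 ROIs consolidated into 10 networks).

```python
import numpy as np

def network_fc(timeseries, labels, n_networks):
    """Pairwise Pearson correlations between ROIs, averaged into
    within-network (diagonal) and between-network (off-diagonal) blocks.

    timeseries : (n_timepoints, n_rois) BOLD signals
    labels     : (n_rois,) integer network assignment per ROI
    """
    fc = np.corrcoef(timeseries.T)      # (n_rois, n_rois) raw connectivity map
    np.fill_diagonal(fc, np.nan)        # exclude self-correlations from averages
    net_fc = np.zeros((n_networks, n_networks))
    for i in range(n_networks):
        for j in range(n_networks):
            block = fc[np.ix_(labels == i, labels == j)]
            net_fc[i, j] = np.nanmean(block)
    return net_fc

# Toy demonstration: 120 timepoints, 20 ROIs grouped into 4 networks of 5
rng = np.random.default_rng(0)
ts = rng.standard_normal((120, 20))
labels = np.repeat(np.arange(4), 5)
net_matrix = network_fc(ts, labels, 4)   # (4, 4) network-level FC
```

The resulting matrix is symmetric, with within-network connectivity on the diagonal and between-network connectivity off it, mirroring the middle panel of the figure.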
Fig 3.
Schematic representation of the three types of supervised machine learning models used in the study. (a) The support vector machine (SVM) is a discriminative model that generates a hyperplane (i.e., decision boundary) which maximizes the separation between two classes in N-dimensional space (N = number of features). The hyperplane is defined by support vectors, the samples which lie at the boundary between classes. (b) Decision tree-based models apply a flowchart-like approach to classification wherein the input data is repeatedly split into smaller sub-groups according to some decision process until a terminal node (i.e., label) is reached. Shown is a subtype of the decision-tree class, the Random Forest, which generates many different trees from a random sample of the data and uses bootstrap aggregation (i.e., bagging) to average the predictions across all trees. (c) Artificial neural networks (ANNs) represent a broad category of machine learning models which loosely imitate the physical structure of the brain. The networks are composed of individual nodes (neurons) arranged in a hierarchical structure; shown is one possible network structure, with a single input layer, two densely connected hidden layers, an output layer with one node for each class, and only feed-forward connections throughout.
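The three model families can be instantiated with scikit-learn as sketched below. These are illustrative stand-ins only: the hyperparameters are library defaults, not the study's tuned values, and the synthetic data is ours.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# One representative of each classifier family from the figure
models = {
    "SVM": SVC(kernel="linear", probability=True, random_state=0),
    "ET": ExtraTreesClassifier(n_estimators=100, random_state=0),
    "ANN": MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=1000,
                         random_state=0),  # two dense hidden layers, feed-forward
}

# Fit each on a small synthetic binary classification problem
X, y = make_classification(n_samples=100, n_features=10, random_state=0)
scores = {name: m.fit(X, y).score(X, y) for name, m in models.items()}
```

Note that the figure's panel (b) depicts a Random Forest for illustration, while the study itself used Extra Trees, which splits nodes at randomly chosen thresholds rather than searching for the best split.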
Fig 4.
Single feature comparisons between awake and deep sedation groups across anesthesia datasets. (a) Distribution of values for ALFF (upper-left), ReHo (upper-middle), within network FC (upper-right), and between network FC (bottom). * indicates Bonferroni-corrected p < 0.05. The ability of single features to discriminate between the two groups was evaluated using a univariate model-free analysis. The within-dataset (Anesthesia → Anesthesia; blue) and cross-dataset (Anesthesia → DOC; pink) AUC is listed above the features with significant group differences.
Fig 5.
Single feature comparisons between healthy controls and UWS/VS groups within DOC dataset. (a) Distribution of values for ALFF (upper-left), ReHo (upper-middle), within network FC (upper-right), and between network FC (bottom). * indicates Bonferroni-corrected p < 0.05.
Fig 6.
A receiver operating characteristic (ROC) curve, which plots a classifier’s true positive rate against the false positive rate, was calculated for each feature independently, both for within-dataset classification (Anesthesia → Anesthesia; blue) and cross-dataset classification (Anesthesia → DOC; pink). The univariate ROC curves were subsequently averaged to yield a representative univariate ROC curve within each of the four analyses of functional connectivity. The representative ROC curve was used to determine the area under the curve (AUC), which served as the quantitative measure of univariate classifier performance. The dashed line represents chance-level performance. Shaded areas represent ± 1 SD.
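The AUC used throughout these analyses can be computed directly from its probabilistic interpretation. The sketch below is a minimal pure-Python illustration (the function name and example values are ours), using the identity that AUC equals the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U identity:
    AUC = P(score of a random positive > score of a random negative),
    counting ties as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Two positives (0.35, 0.8) vs. two negatives (0.1, 0.4): 3 of 4 pairs correct
demo_auc = auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

An AUC of 0.5 corresponds to the chance-level dashed line in the figure; 1.0 is perfect discrimination.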
Fig 7.
Support vector machine (SVM), Extra Trees (ET), and artificial neural network (ANN) performance without hyperparameter optimization or feature selection (Default), with feature pruning only (Pruned), and with hyperparameter optimization only (Optimized). (a,b,c) Within-dataset reliability (Anesthesia → Anesthesia) for each model was evaluated using 100×5 nested cross-validation. (d,e,f) Cross-dataset generalizability (Anesthesia → DOC) was evaluated by testing the fully trained models on 100 bootstrap samples of the DOC data. The solid lines represent the mean ROCs across 100 evaluations. Shaded areas represent ± 1 SD. The dashed line represents chance-level performance (AUC = 0.50). * indicates Bonferroni-corrected p < 0.05.
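Nested cross-validation of the kind described in this caption separates hyperparameter selection (inner loop) from performance estimation (outer loop). The sketch below shows one repetition of such a scheme with scikit-learn; the data, parameter grid, and fold seeds are illustrative assumptions, not the study's configuration (the paper repeated a 5-fold scheme 100 times).

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for the rs-fMRI feature matrix
X, y = make_classification(n_samples=88, n_features=40, random_state=0)

inner = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)  # tunes C
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)  # estimates AUC

search = GridSearchCV(SVC(kernel="linear"),
                      param_grid={"C": [0.01, 0.1, 1.0]},
                      cv=inner, scoring="roc_auc")

# Each outer fold refits the full inner search on its training split,
# so the reported AUC is never based on data used for tuning.
scores = cross_val_score(search, X, y, cv=outer, scoring="roc_auc")
```

Averaging `scores` over many reseeded repetitions yields the mean ROC/AUC estimates of the kind reported in the figure.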
Fig 8.
Computational stress tests and analysis of feature importance. (a) Variable fractions of the functional connectivity features (0–100%) were randomly dropped (zeroed) in the test dataset. The effect of random dropping was quantified using a mean area under the curve (AUC) analysis across 100 bootstrap samples of the DOC data before and after removal. (b) Performance across variable signal-to-noise ratios (1/1–1/100) was quantified using the previously described DOC sampling and testing procedure. Dotted line represents chance-level performance (AUC = 0.50). Shaded areas represent ± 1 SD.
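A stress test of this kind can be sketched as a perturbation of the held-out feature matrix. The helper below is our own illustration, not the study's code, and assumes one plausible reading of the caption: whole feature columns are zeroed, and noise is added at a chosen signal-to-noise ratio (defined here as signal SD over noise SD).

```python
import numpy as np

def stress_features(X, drop_frac=0.0, snr=None, rng=None):
    """Return a perturbed copy of test features: randomly zero a fraction
    of feature columns, then optionally add Gaussian noise at a given SNR."""
    rng = np.random.default_rng(rng)
    Xp = X.astype(float).copy()
    n_drop = int(round(drop_frac * X.shape[1]))
    dropped = rng.choice(X.shape[1], size=n_drop, replace=False)
    Xp[:, dropped] = 0.0                      # simulate missing features
    if snr is not None:
        noise_sd = Xp.std() / snr             # noise scaled to the signal
        Xp += rng.normal(0.0, noise_sd, size=Xp.shape)
    return Xp

# Demo: dropping 50% of 20 feature columns zeroes exactly 10 of them
X_demo = np.ones((10, 20))
X_half = stress_features(X_demo, drop_frac=0.5, rng=0)
```

Re-scoring a trained classifier on the perturbed matrix at each `drop_frac` or `snr` value traces out degradation curves like those in the figure.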
Fig 9.
Exploratory post-hoc analysis of feature importance for the optimized support vector machine (SVM) and Extra Trees (ET) models. (a) Since the optimized SVM was linear, feature importance was quantified by squaring the weights of the coefficients used by the model. (b) Within the ET model, feature importance corresponded to how much each feature decreased the Gini impurity. Across both models, larger values (red) are associated with higher feature importance relative to features with lower values (blue).
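Both importance measures named in the caption are directly available from fitted scikit-learn models, as sketched below on synthetic data (the data and model settings are illustrative assumptions).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.svm import SVC

# Synthetic stand-in for the connectivity feature matrix
X, y = make_classification(n_samples=100, n_features=15, random_state=0)

# (a) Linear SVM: importance = squared weight of each feature's coefficient
svm = SVC(kernel="linear").fit(X, y)
svm_importance = np.squeeze(svm.coef_) ** 2

# (b) Extra Trees: importance = normalized mean decrease in Gini impurity
et = ExtraTreesClassifier(n_estimators=200, random_state=0).fit(X, y)
et_importance = et.feature_importances_
```

The squared-coefficient measure only makes sense for a linear kernel, which is why the caption stresses that the optimized SVM was linear; the Gini-based importances sum to 1 by construction.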
Fig 10.
Class assignment probability across models for subjects not included in the training data, from left to right: light anesthetic sedation (Light), recovery from anesthetic sedation (Rec), UWS/VS, MCS, and healthy controls (HC) from the DOC dataset. Models were trained on the anesthesia datasets, such that 0 mapped to an unresponsive state and 1 mapped to an awake state. The predicted classification probabilities for each group were compared to a binary decision threshold set at 0.5 to identify groups reliably classified as awake or unresponsive. A secondary analysis was performed to identify differences between the MCS and UWS/VS groups, the MCS and Wake groups, and the Light and Rec groups. * indicates uncorrected p

Source: PubMed
