Molecular-based diagnosis of multiple sclerosis and its progressive stage

Christopher Barbour, Peter Kosa, Mika Komori, Makoto Tanigawa, Ruturaj Masvekar, Tianxia Wu, Kory Johnson, Panagiotis Douvaras, Valentina Fossati, Ronald Herbst, Yue Wang, Keith Tan, Mark Greenwood, Bibiana Bielekova

Abstract

Objective: Biomarkers aid diagnosis, allow inexpensive screening of therapies, and guide selection of patient-specific therapeutic regimens in most internal medicine disciplines. In contrast, neurology lacks validated measurements of the physiological status, or dysfunction(s) of cells of the central nervous system (CNS). Accordingly, patients with chronic neurological diseases are often treated with a single disease-modifying therapy without understanding patient-specific drivers of disability. Therefore, using multiple sclerosis (MS) as an example of a complex polygenic neurological disease, we sought to determine whether cerebrospinal fluid (CSF) biomarkers are intraindividually stable, cell type-, disease- and/or process-specific, and responsive to therapeutic intervention.

Methods: We used statistical learning in a modeling cohort (n = 225) to develop diagnostic classifiers from DNA-aptamer-based measurements of 1,128 CSF proteins. An independent validation cohort (n = 85) assessed the reliability of derived classifiers. The biological interpretation resulted from in vitro modeling of primary or stem cell-derived human CNS cells and cell lines.

Results: The classifier that differentiates MS from CNS diseases that mimic MS clinically, pathophysiologically, and on imaging achieved a validated area under the receiver operating characteristic curve (AUROC) of 0.98, whereas the classifier that differentiates relapsing-remitting from progressive MS achieved a validated AUROC of 0.91. No classifiers could differentiate primary progressive from secondary progressive MS better than random guessing. Treatment-induced changes in biomarkers greatly exceeded intraindividual and technical variabilities of the assay.

Interpretation: CNS biological processes reflected by CSF biomarkers are robust, stable, and disease- or even disease stage-specific. This opens opportunities for broad utilization of CSF biomarkers in drug development and precision medicine for CNS disorders. Ann Neurol 2017;82:795-812.

Conflict of interest statement

Potential conflicts of interest: BB, PK, MK, CB, and MG are co-inventors of US patent application number 62/038,530: Biomarkers for Diagnosis and Management of Neuro-immunological Diseases, which pertains to the results of this paper. BB, PK, and MK have assigned their patent rights to the US Department of Health and Human Services.

Published 2017. This article is a US Government work and is in the public domain in the USA.

Figures

Figure 1. SNR calculation and differences in technical and biological replicates versus pre- and post-treatment samples
(A) Graphical example of technical (top graphs) and biological (bottom graphs) variance calculation for SOMAmer SL004672. The x-axes correspond to the patient/sample from which the measurements were obtained. The upper panels show technical replicates (n=88) and the lower panels show biological replicates (n=24). The left panels show the raw measurements (natural-log scale RFU) for SOMAmer SL004672 for each sample on the y-axes. The right panels show the identical raw observations with the random intercept effect subtracted to account for subject-to-subject variation. These residuals (after subtracting the means of technical or biological replicates) were used to estimate the technical and biological variance, respectively. The horizontal black line is the estimated mean from the technical (top graphs) and biological (bottom graphs) variance models. (B) Differences in biomarker measurements in identical samples analyzed repeatedly (technical replicates, n=88, left), in longitudinal HD samples measured at different time points (biological replicates, n=24, middle), and in patient samples before and after application of immunomodulatory therapy (biological changes, n=10, right) were quantified in two ways: (i) an average of Spearman rho values calculated across the 500 high-signaling (high-SNR) SOMAmers, and (ii) an average of variabilities (the median relative percent change, calculated for each of the 500 high-signaling SOMAmers as the absolute difference in RFU between two replicates divided by the average of the two RFUs) for all pairs of replicates in each respective category. Examples of the strongest and the weakest correlations between two samples in each category are visualized on the 500 high-signaling SOMAmers. The axes show log10 scales of relative fluorescence units (RFU) of the SOMAmers.
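The two agreement metrics in panel B are simple to compute. Below is a minimal Python sketch, with illustrative variable names not taken from the paper, that returns the Spearman rho across the high-signaling SOMAmers and the median relative percent change for one pair of replicates.

```python
import numpy as np
import pandas as pd
from scipy.stats import spearmanr

def replicate_agreement(rep_a: pd.Series, rep_b: pd.Series) -> tuple[float, float]:
    """Agreement between two replicate profiles of the same sample.

    rep_a, rep_b: RFU values indexed by SOMAmer ID, restricted to the
    high-signaling (high-SNR) SOMAmers.
    Returns (Spearman rho, median relative percent change).
    """
    a, b = rep_a.align(rep_b, join="inner")           # match SOMAmer IDs
    rho, _ = spearmanr(a.values, b.values)            # rank correlation
    rel_change = (a - b).abs() / ((a + b) / 2) * 100  # % change per SOMAmer
    return float(rho), float(rel_change.median())
```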
Figure 2. Highly simplified artificial example explaining the principles of random forests
(A) A decision tree differentiates groups of observations (elements) using selected features. For example, to differentiate RRMS from progressive MS, useful features may include MRI contrast-enhancing lesions (CELs), T2 lesions, IgG index, and disability. (B) Assembling features into a decision tree provides better results than classifying based on any single feature. A decision tree algorithm selects from the available features the one that best differentiates the diagnostic categories and computes its optimal threshold (i.e., the value on which to split the elements). The algorithm then finds the next best feature to split the categories, and this process repeats until a termination criterion is met, e.g., when a certain number of splits has occurred. The number of splits corresponds to the depth of a tree. A random forest algorithm mitigates the problem of unreliable predictions caused by overfitting. A random forest is a collection of decision trees, each generated slightly differently, using a random subset of features and elements. First, the algorithm restricts the number of features from which each new tree is constructed: if testing p features, the algorithm randomly selects √p features as candidates for every split in a tree. Second, each tree is constructed from a random sample of patients (with replacement) of the same size as the original training cohort (bootstrapping). The observations withheld from each tree due to bootstrapping are used to calculate the out-of-bag (OOB) misclassification error. In our example (C), only √4 = 2 features are used for each split in decision trees of depth 2, with each decision tree generated from a bootstrapped subset of the training dataset. Panel C illustrates possible partitions for the CELs-T2 lesions (upper) and CELs-Disability (lower) combinations of features, while panel (D) shows corresponding examples of decision trees, with a total of four trees in the forest. The final prediction is derived as the average prediction from all randomly generated trees. For example, if one tree classified a patient as progressive MS but the other 3 trees classified the patient as RRMS, the subject would be classified as RRMS with 75% probability. Because individual trees are highly variable, the algorithm is typically run for many trees until the OOB predictions stabilize; consequently, the resulting classifier cannot be described by a mathematical equation or a single decision tree. The randomness assures that the algorithm searches the entire p-dimensional partition space (E; only 3 dimensions shown, but the search space is 4-dimensional) for the best features, and by averaging the partitioning thresholds in the training cohort, the classifier also effectively derives optimal global thresholds. By calculating the average OOB error when a feature is omitted from the construction of a tree, we can generate a global "variable importance" metric (F) that reflects the decrease in accuracy of the random forest classifier in the absence of the specific feature.
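The construction described above corresponds to a standard random forest. The following scikit-learn sketch, using synthetic stand-in data rather than the study's features, illustrates √p feature sampling at each split, bootstrapping with out-of-bag scoring, class probabilities as averaged votes across trees, and a permutation-based importance that approximates the OOB-omission importance described for panel F.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Synthetic stand-in data: 4 features, analogous to CELs, T2 lesions,
# IgG index, and disability in the figure's toy example.
X, y = make_classification(n_samples=200, n_features=4, n_informative=3,
                           n_redundant=0, random_state=0)

forest = RandomForestClassifier(
    n_estimators=500,        # many trees, until OOB predictions stabilize
    max_features="sqrt",     # consider sqrt(p) candidate features per split
    bootstrap=True,          # each tree sees a bootstrap sample of patients
    oob_score=True,          # accuracy estimated on withheld (OOB) observations
    random_state=0,
).fit(X, y)

print("OOB accuracy:", forest.oob_score_)
print("Class probability for first subject:",
      forest.predict_proba(X[:1]))   # average vote across all trees

# Permutation importance: drop in accuracy when a feature is scrambled,
# a close analogue of the OOB-omission importance described in the figure.
imp = permutation_importance(forest, X, y, n_repeats=20, random_state=0)
print("Variable importance:", imp.importances_mean)
```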
Figure 3. Schematic diagram of the SOMAscan analysis leading to molecular diagnostic tools
The SOMAscan assay comprises 1,128 SOMAmers (solid black line curves). Calculation of technical and biological SNR reduced the number of SOMAmers considered for further analysis to 500 (dashed red line curves). From the 500 high-signaling SOMAmers, 124,750 biomarker ratios were generated and subsequently tested for their SNR and their ability to differentiate two diagnostic groups (based on AUC in the modeling cohort), resulting in 5,401 high-signaling biomarker ratios for the MS versus non-MS diagnostic test, 3,626 biomarker ratios for the progressive versus RRMS diagnostic test, and 1,504 biomarker ratios for the SPMS versus PPMS diagnostic test (green dotted line curves). Out-of-bag AUC estimates (bottom graphs), examined across random forests generated by sequentially adding the ratios with the highest variable importance, led to a logical cut-off (marked by the solid red line) of 22 SOMAmer ratios for the MS versus non-MS diagnostic comparison, 21 SOMAmer ratios for the RRMS versus progressive MS diagnostic comparison, and 33 SOMAmer ratios for the SPMS versus PPMS comparison (blue dash-dot line curves). Restriction of the SOMAmer ratios to the most important ones resulted in a validated AUROC of 0.98 (MS versus non-MS, CI: 0.94–1.00), 0.91 (RRMS versus progressive MS, CI: 0.80–1.00), and 0.58 (SPMS versus PPMS, CI: 0.37–0.79). CI – 95% confidence interval, ER – error rate.
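Two computational steps are implied by this diagram: generating all pairwise log ratios from the 500 high-signaling SOMAmers, and tracking out-of-bag AUC as the top-importance ratios are added one by one. The sketch below is a minimal illustration under those assumptions; array and function names are invented for clarity, and the published pipeline additionally filters candidate ratios by SNR and modeling-cohort AUC before this step.

```python
import numpy as np
from itertools import combinations
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

def log_ratios(rfu: np.ndarray):
    """All pairwise natural-log ratios: 500 analytes -> 124,750 ratios."""
    log_rfu = np.log(rfu)
    pairs = list(combinations(range(rfu.shape[1]), 2))
    ratios = np.column_stack([log_rfu[:, i] - log_rfu[:, j] for i, j in pairs])
    return ratios, pairs

def oob_auc_curve(ratios: np.ndarray, y: np.ndarray, importance_order, max_k=50):
    """OOB AUC of forests built on the k most important ratios, k = 1..max_k."""
    aucs = []
    for k in range(1, max_k + 1):
        rf = RandomForestClassifier(n_estimators=500, oob_score=True,
                                    random_state=0)
        rf.fit(ratios[:, importance_order[:k]], y)
        aucs.append(roc_auc_score(y, rf.oob_decision_function_[:, 1]))
    return aucs  # inspect for the point where adding ratios stops helping

# importance_order would come from an initial forest fit on all candidate
# ratios, e.g. np.argsort(initial_forest.feature_importances_)[::-1].
```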
Figure 4. STARD (Standards for Reporting Diagnostic accuracy studies) diagrams and confusion matrices reporting the flow of subjects used for validation of the molecular diagnostic test
(A) STARD diagram and (C) confusion matrix for the 85 subjects used for validation of the MS molecular diagnostic test; (B) STARD diagram and (D) confusion matrix for the 47 subjects used for validation of the progressive MS molecular diagnostic test.
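For reference, a confusion matrix such as those in panels C and D can be tabulated from the classifier's predicted probabilities at a 50% decision threshold. The values below are illustrative stand-ins, not the study's data.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Illustrative stand-ins: predicted MS probability per validation subject
# and the clinical reference diagnosis (1 = MS, 0 = non-MS).
proba = np.array([0.92, 0.15, 0.66, 0.08, 0.41, 0.88])
truth = np.array([1, 0, 1, 0, 1, 1])

pred = (proba >= 0.5).astype(int)                      # 50% cut-off
tn, fp, fn, tp = confusion_matrix(truth, pred).ravel()
print(f"sensitivity = {tp / (tp + fn):.2f}, specificity = {tn / (tn + fp):.2f}")
```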
Figure 5. MS versus non-MS molecular diagnostic test
A parallel coordinate plot (PCP) for the 22 most important features that distinguish MS from non-MS. The plot displays individual patients from the combined modeling (n=225) and validation (n=85) cohorts, divided into an MS group (RRMS, PPMS, SPMS; thin red lines) and a non-MS group (HD, NIND, OIND; thin blue lines). Group averages are shown as a thick yellow line for PPMS, thick red line for RRMS, thick orange line for SPMS, thick purple line for HD, thick green line for NIND, and thick blue line for OIND. The y-axis shows SOMAmer natural-log ratios scaled to the 0–1 range. SOMAmer ratios were grouped into nine groups based on the cellular origin and known functions of the individual components. Different cell types are shown above the PCP to highlight the cellular origin of individual SOMAmers. MS patients show higher plasma cell/plasmablast activation/levels compared to overall intrathecal inflammation (group 1a), myeloid lineage (groups 1b and 1c), epithelial damage (group 1d), CNS destruction and epithelial injury (group 2a), differences in immunoglobulin subtypes (group 2b), CNS and endothelial damage (group 2c), astrocyte activation (group 2d), and higher epithelial injury compared to neutrophil activation (group 3). *Ratios in the classifier are inverted.
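The y-axis in this plot is a min-max scaling of the natural-log ratios to the 0–1 range. A minimal sketch of how such a parallel coordinate plot can be produced with pandas and matplotlib is shown below; the data layout and column names are illustrative.

```python
import numpy as np
import pandas as pd
from pandas.plotting import parallel_coordinates
import matplotlib.pyplot as plt

# Illustrative layout: one row per patient, one column per classifier ratio
# (ln RFU_a - ln RFU_b), plus a 'group' column (MS vs non-MS).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(30, 5)),
                  columns=[f"ratio_{i}" for i in range(1, 6)])
df["group"] = np.where(rng.random(30) > 0.5, "MS", "non-MS")

# Scale each ratio to the 0-1 range, as on the figure's y-axis.
cols = df.columns[:-1]
df[cols] = (df[cols] - df[cols].min()) / (df[cols].max() - df[cols].min())

parallel_coordinates(df, class_column="group", color=["red", "blue"], alpha=0.4)
plt.ylabel("scaled ln ratio (0-1)")
plt.show()
```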
Figure 6. Progressive vs RRMS molecular diagnostic test
A parallel coordinate plot (PCP) for the 21 most important variables that distinguish progressive MS from relapsing MS. The plot displays individual MS patients from the combined modeling (n=120) and validation (n=47) cohorts, divided into a progressive MS group (PPMS and SPMS; thin blue lines) and a relapsing MS group (RRMS; thin red lines). Group averages are shown as a thick purple line for PPMS, thick blue line for SPMS, and thick red line for RRMS. The y-axis shows SOMAmer natural-log ratios scaled to the 0–1 range. SOMAmer ratios were grouped based on the cellular origin and known functions of the individual components. Different cell types are shown above the PCP to highlight the cellular origin of individual SOMAmers. Progressive MS patients show increased loss of neuronal, oligodendroglial, astrocytic, and neuroprotective markers (groups 1a and 1b), proportional loss of oligodendroglial marker compared to myeloid lineage and epithelial markers (group 1c), increased epithelial injury in comparison to overall immune activation (group 1d), enhanced complement activation (groups 2a and 2b), and dysregulation of pathways linked to formation of tertiary lymphoid follicles (groups 2c, 2d, 2e, 3) and to platelet aggregation (group 3). *Ratios in the classifier are inverted.
Figure 7. Comparison of the performance of clinical and molecular diagnostic tests
(A) The MS diagnostic probability (y-axis) of the 85 subjects from the validation cohort. Blue circles represent subjects with an original non-MS diagnosis (HD, OIND, NIND) and orange circles represent subjects with an original MS diagnosis (RRMS, PPMS, SPMS). The red line represents an arbitrary cut-off at 50%. The pink background marks the area between 30% and 70%, where the certainty of the molecular classification is weak (it contains 22.4% of the validation cohort's subjects). The orange background highlights the 70.0% of the validation cohort's MS subjects with a highly probable MS molecular diagnosis (>70%), and the blue background labels the 86.8% of the validation cohort's non-MS subjects with a high probability of a non-MS molecular diagnosis (<30%). The gray bars represent a frequency distribution with a bin size of 5%. (B) Misdiagnosed subjects (pink circles) were evaluated for non-SOMAlogic biomarkers of inflammation: IgG index, BCMA, sCD27, and CHI3L1, using alternative assays (for details on methodology, see (35)). Group medians are shown as an orange line for MS subjects and a blue line for non-MS subjects. The seven MS subjects classified as non-MS by the molecular diagnostic test show a non-inflammatory type of disease, whereas the two non-MS (OIND) subjects categorized as MS by the SOMAlogic MS molecular classifier show significant levels of inflammatory markers, overlapping with MS. (C) Comparison of IgG index data (left) and molecular MS diagnostic probability (right) in the combined modeling and validation cohorts shows the distributions of non-MS (blue circles) and MS subjects (orange circles). (D) Separation of RRMS (green circles) and progressive MS (PMS, purple circles) subjects into two age categories (<45 years, left; >45 years, right) shows that age does not affect the performance of the progressive MS classifier.
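The shading in panel A amounts to a three-way decision rule over the classifier's predicted MS probability, with an indeterminate zone between 30% and 70%. A minimal sketch of that rule, with the thresholds taken from the figure:

```python
def classify_with_uncertainty(p_ms: float,
                              low: float = 0.30,
                              high: float = 0.70) -> str:
    """Three-way call from the molecular MS probability (Figure 7A bands)."""
    if p_ms > high:
        return "MS (high confidence)"
    if p_ms < low:
        return "non-MS (high confidence)"
    return "indeterminate (30-70% zone)"

print(classify_with_uncertainty(0.85))   # -> MS (high confidence)
print(classify_with_uncertainty(0.45))   # -> indeterminate (30-70% zone)
```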

Source: PubMed
