Unsupervised phenotyping of Severe Asthma Research Program participants using expanded lung data

Wei Wu, Eugene Bleecker, Wendy Moore, William W Busse, Mario Castro, Kian Fan Chung, William J Calhoun, Serpil Erzurum, Benjamin Gaston, Elliot Israel, Douglas Curran-Everett, Sally E Wenzel

Abstract

Background: Previous studies have identified asthma phenotypes based on small numbers of clinical, physiologic, or inflammatory characteristics. However, no studies have applied machine learning approaches to a wide range of variables.

Objectives: We sought to identify subphenotypes of asthma by using blood, bronchoscopic, exhaled nitric oxide, and clinical data from the Severe Asthma Research Program with unsupervised clustering and then characterize them by using supervised learning approaches.

Methods: Unsupervised clustering approaches were applied to 112 clinical, physiologic, and inflammatory variables from 378 subjects. Variable selection and supervised learning techniques were used to select relevant and nonredundant variables and address their predictive values, as well as the predictive value of the full variable set.
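The two-stage workflow described in the Methods (cluster subjects without labels, then test how well a reduced variable set predicts cluster membership) can be illustrated with a minimal sketch. The abstract does not name the specific algorithms used, so everything here is an assumption for illustration only: k-means stands in for the unsupervised step, a nearest-centroid classifier stands in for the supervised step, the data are synthetic, and the "selected" variable subset is chosen by hand rather than by a variable selection method.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_point(points):
    """Column-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def kmeans(rows, k, iters=50, seed=0):
    """Plain Lloyd's k-means (a stand-in, not the study's method)."""
    rng = random.Random(seed)
    centroids = rng.sample(rows, k)
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assign each subject to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(r, centroids[c])) for r in rows]
        new_centroids = []
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            # Keep the old centroid if a cluster happens to be empty.
            new_centroids.append(mean_point(members) if members else centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

def fit_centroids(rows, labels, cols):
    """Nearest-centroid 'training': one mean vector per cluster label."""
    groups = {}
    for r, lab in zip(rows, labels):
        groups.setdefault(lab, []).append([r[c] for c in cols])
    return {lab: mean_point(pts) for lab, pts in groups.items()}

def accuracy(model, rows, labels, cols):
    """Fraction of rows whose nearest class centroid matches the label."""
    hits = 0
    for r, lab in zip(rows, labels):
        x = [r[c] for c in cols]
        pred = min(model, key=lambda m: sq_dist(x, model[m]))
        hits += pred == lab
    return hits / len(rows)

def make_data(n_per_group=40, seed=1):
    """Synthetic subjects: 3 informative variables plus 3 pure-noise variables."""
    rng = random.Random(seed)
    centers = [(0, 0, 0), (5, 5, 0), (0, 5, 5)]
    rows = []
    for center in centers:
        for _ in range(n_per_group):
            informative = [m + rng.gauss(0, 1) for m in center]
            noise = [rng.gauss(0, 1) for _ in range(3)]
            rows.append(informative + noise)
    rng.shuffle(rows)
    return rows

rows = make_data()
labels, _ = kmeans(rows, k=3)           # unsupervised step
split = int(0.75 * len(rows))           # hold out 25% as test cases
train_x, test_x = rows[:split], rows[split:]
train_y, test_y = labels[:split], labels[split:]

all_cols = list(range(6))
subset_cols = [0, 1, 2]                 # hand-picked subset (hypothetical)
acc_all = accuracy(fit_centroids(train_x, train_y, all_cols), test_x, test_y, all_cols)
acc_subset = accuracy(fit_centroids(train_x, train_y, subset_cols), test_x, test_y, subset_cols)
print(f"all variables: {acc_all:.2f}, subset: {acc_subset:.2f}")
```

Comparing `acc_all` with `acc_subset` mirrors the kind of comparison reported in the Results (full variable set versus a reduced set), although the numbers from this toy data have no bearing on the study's findings.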

Results: Ten variable clusters and 6 subject clusters were identified, which differed from and overlapped with previous clusters. Patients with traditionally defined severe asthma were distributed through subject clusters 3 to 6. Cluster 4 identified patients with early-onset allergic asthma with low lung function and eosinophilic inflammation. Patients with later-onset, mostly severe asthma with nasal polyps and eosinophilia characterized cluster 5. Cluster 6 asthmatic patients manifested persistent inflammation in blood and bronchoalveolar lavage fluid and exacerbations despite high systemic corticosteroid use and side effects. Age of asthma onset, quality of life, symptoms, medications, and health care use were some of the 51 nonredundant variables distinguishing subject clusters. These 51 variables classified test cases with 88% accuracy compared with 93% accuracy with all 112 variables.

Conclusion: The unsupervised machine learning approaches used here provide unique insights into disease, confirming other approaches while revealing novel additional phenotypes.

Keywords: Asthma phenotyping; supervised machine learning approaches; unsupervised approaches; variable analysis.

Copyright © 2014 American Academy of Allergy, Asthma & Immunology. Published by Mosby, Inc. All rights reserved.

Figures

Figure 1
Asthma symptom and Quality of Life variables differed by subject clusters. (A–C) Cluster 6 had the highest shortness of breath frequency (A), the highest cough frequency (B), and the lowest AQLQ activity limitation score (C) among all clusters. The intergroup differences for all variables can be found in Table E3.
Figure 2
Age of asthma onset and allergy differed by subject clusters according to asthma disease status. (A–B) Cluster 5 had later onset (A) and lower numbers of allergen skin reactions (B) than all other clusters. (C) Cluster 2 had fewer allergy symptoms in winter than Clusters 4–6. (D) Clusters 2–6 had more asthma symptoms caused by animal exposure than Cluster 1.
Figure 3
Health care utilization differed by subject clusters. (A) Subjects in Clusters 3–6 were more likely to have seen a doctor in the last 12 months for asthma than those in Cluster 2. (B–C) Cluster 6 had a higher proportion of subjects who visited the ER for breathing in the last year (B) and a higher number of ICU admissions for asthma (C) than all other clusters.
Figure 4
Corticosteroid use, treatment consequences, and associated clinical characteristics differ across subject clusters. (A–C) Cluster 6 had a higher proportion of subjects with >3 oral CS bursts in the previous year (A), on oral CS (B), and with osteoporosis (C) than all other asthma clusters. (D–E) Cluster 5 had the highest proportion of subjects with nasal polyps removed (D) and with sinusitis (E).
Figure 5
Airway responsiveness and Th2-inflammatory markers differ across the subject clusters. (A) Prebronchodilator FEV1/FVC was lower in Cluster 6 than in all other clusters. (B) There were no differences in reversibility across the asthma clusters. (C) Cluster 6 had higher FENO than all other clusters. (D) Clusters 2, 4, and 5 had higher blood eosinophil numbers than Cluster 1.

Source: PubMed
