Distinguishing Smoking-Related Lung Disease Phenotypes Via Imaging and Molecular Features

Ehab Billatos, Samuel Y Ash, Fenghai Duan, Ke Xu, Justin Romanoff, Helga Marques, Elizabeth Moses, MeiLan K Han, Elizabeth A Regan, Russell P Bowler, Stefanie E Mason, Tracy J Doyle, Rubén San José Estépar, Ivan O Rosas, James C Ross, Xiaohui Xiao, Hanqiao Liu, Gang Liu, Gauthaman Sukumar, Matthew Wilkerson, Clifton Dalgard, Christopher Stevenson, Duncan Whitney, Denise Aberle, Avrum Spira, Raúl San José Estépar, Marc E Lenburg, George R Washko, DECAMP and COPDGene Investigators

Abstract

Background: Chronic tobacco smoke exposure results in a broad range of lung pathologies, including emphysema, airway disease, and parenchymal fibrosis, as well as a multitude of extrapulmonary comorbidities. Prior work using CT imaging has identified several clinically relevant subgroups of smoking-related lung disease, but these investigations have generally lacked organ-specific molecular correlates.

Research question: Can CT imaging be used to identify clinical phenotypes of smoking-related lung disease that have specific bronchial epithelial gene expression patterns, and thereby improve understanding of disease pathogenesis?

Study design and methods: Using K-means clustering, we clustered participants from the COPDGene study (n = 5,273) based on CT imaging characteristics and then evaluated their clinical phenotypes. These clusters were replicated in the Detection of Early Lung Cancer Among Military Personnel (DECAMP) cohort (n = 360), and were further characterized using bronchial epithelial gene expression.
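As an illustration of the clustering step described above, the following is a minimal sketch of K-means on standardized imaging features, followed by assignment of replication-cohort participants to the nearest discovery centroid. The feature values, the two-feature layout, the deterministic farthest-point initialization, and the nearest-centroid replication rule are all assumptions made for this example; the abstract does not state the features, the initialization, or the exact replication procedure used in the study.

```python
# Toy K-means sketch in pure Python (illustrative only, not the study's pipeline).

def standardize(rows, means=None, sds=None):
    """Z-score each column. If means/sds are given, reuse them (replication cohort)."""
    cols = list(zip(*rows))
    if means is None:
        means = [sum(c) / len(c) for c in cols]
        sds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
               for c, m in zip(cols, means)]
    scaled = [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]
    return scaled, means, sds

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda k: sq_dist(point, centroids[k]))

def kmeans(points, k, n_iter=100):
    # Assumption: deterministic farthest-point initialization, chosen here so the
    # example is reproducible. Standard K-means usually starts from random seeds.
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(n_iter):
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, labels

# Hypothetical discovery data: [percent emphysema, percent interstitial change].
discovery = [[1, 1], [2, 1], [1, 2],        # preserved-like
             [2, 20], [3, 22], [2, 21],     # interstitial-like
             [25, 2], [27, 3], [26, 2]]     # emphysema-like
scaled, means, sds = standardize(discovery)
centroids, labels = kmeans(scaled, k=3)

# Hypothetical replication participants, scaled with the discovery means and SDs.
replication = [[26, 3], [1.5, 1.5]]
rep_scaled, _, _ = standardize(replication, means, sds)
rep_labels = [nearest(p, centroids) for p in rep_scaled]
```

Scaling the replication cohort with the discovery cohort's means and standard deviations keeps both cohorts in the same feature space, which is what makes the nearest-centroid assignment meaningful in this sketch.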

Results: Three clusters (preserved, interstitial predominant, and emphysema predominant) were identified. Compared to the preserved cluster, the interstitial and emphysema clusters had worse lung function, exercise capacity, and quality of life. In longitudinal follow-up, individuals from the emphysema group had greater declines in exercise capacity and lung function, more emphysema, more exacerbations, and higher mortality. Similarly, genes involved in inflammatory pathways (tumor necrosis factor-α, interferon-β) were more highly expressed in bronchial epithelial cells from individuals in the emphysema cluster, while genes associated with T-cell-related biology had lower expression in these samples. Samples from individuals in the interstitial cluster generally had intermediate levels of expression of these genes.

Interpretation: Using quantitative CT imaging, we identified three groups of older ever-smokers that replicate in two cohorts. Airway gene expression differences between the three groups suggest increased levels of inflammation in the most severe clinical phenotype, possibly mediated by the tumor necrosis factor-α and interferon-β pathways.

Clinical trial registration: COPDGene (NCT00608764), DECAMP-1 (NCT01785342), DECAMP-2 (NCT02504697).

Keywords: COPD; airway gene expression; diagnostic imaging; gene expression; imaging; interferon.

Copyright © 2020 The Authors. Published by Elsevier Inc. All rights reserved.

Figures

Figure 1
A-C, Representative CT images from each of the three cluster phenotypes: A, Preserved. B, Interstitial predominant. C, Emphysema.
Figure 2
Cluster Assignment Using Principal Component Analysis. Overlap of the Detection of Early Lung Cancer Among Military Personnel imaging clusters projected onto the first two principal components of the COPDGene imaging features. DECAMP = Detection of Early Lung Cancer Among Military Personnel; PC1 = principal component 1; PC2 = principal component 2.
Figure 3
A-B, Comparison of Imaging Characteristics of the Clusters in the COPDGene and Detection of Early Lung Cancer Among Military Personnel Cohorts. A, The imaging features used in the identification of the patient clusters in COPDGene were compared between patients assigned to each of the three clusters. B, The same imaging features were compared between DECAMP patients assigned to each of the three clusters. Global differences for each imaging feature among the three clusters were assessed using analysis of variance and found to be statistically significantly different (P < .001). Pairwise differences were assessed with the use of t-tests. Two asterisks indicate P ≤ .01; four asterisks indicate P ≤ .0001. ns = not significant at P > .05.
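To make the testing scheme in this caption concrete, here is a minimal sketch of a one-way analysis of variance F statistic across three groups, followed by a pairwise two-sample t statistic. The data are invented, and the choice of Welch's unequal-variance form for the t statistic is an assumption, since the caption does not say which t-test variant was used. P values are omitted because they need distribution functions; in practice a statistics library such as SciPy (`scipy.stats.f_oneway`, `scipy.stats.ttest_ind`) would supply them.

```python
# Toy one-way ANOVA F statistic and Welch t statistic (illustrative only).

def mean(xs):
    return sum(xs) / len(xs)

def sample_var(xs):
    """Unbiased sample variance (divides by n - 1)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def one_way_anova_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def welch_t(a, b):
    """Two-sample t statistic without assuming equal variances (assumed variant)."""
    se = (sample_var(a) / len(a) + sample_var(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se

# Hypothetical values of one imaging feature in three clusters.
preserved, interstitial, emphysema = [1, 2, 3], [2, 3, 4], [5, 6, 7]
f_stat = one_way_anova_f([preserved, interstitial, emphysema])
t_stat = welch_t(preserved, emphysema)
```

The global F test asks whether any cluster differs; the pairwise t statistics then locate which clusters differ from each other.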
Figure 4
A-C, Comparison of Clinical Characteristics and Mortality Rates of the Clusters in the COPDGene Cohort. A, The clinical characteristics identified in COPDGene were compared among the three clusters. Global differences for each clinical characteristic among the three clusters were assessed with the use of analysis of variance and found to be statistically significantly different (P < .001). Pairwise differences were assessed with the use of t-tests without adjustment for multiple comparisons. Four asterisks indicate P ≤ .0001. B, The survival rate of the three clusters identified in COPDGene is demonstrated in this Kaplan-Meier curve. Individuals in the emphysema-predominant cluster had the lowest 5-year survival rate; individuals in the preserved cluster had the highest 5-year survival rate. C, The inset table shows the results of multivariable Cox regression analyses that compared the interstitial predominant cluster and emphysema cluster with the preserved cluster. Note that these analyses were adjusted for age, sex, race, smoking status, and FEV1 at baseline. SGRQ = St. George's Respiratory Questionnaire.
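For readers unfamiliar with the survival curve in panel B, the sketch below shows how a Kaplan-Meier estimate is computed from follow-up times and event indicators. The follow-up data and the helper name `survival_at` are hypothetical and chosen for illustration only; the adjusted Cox regression in panel C is not reproduced here.

```python
# Toy Kaplan-Meier estimator in pure Python (illustrative only).

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    events: 1 = death observed, 0 = censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk  # product-limit update
            curve.append((t, surv))
        at_risk -= removed  # deaths and censored both leave the risk set
    return curve

def survival_at(curve, t):
    """Estimated survival probability at time t (step function)."""
    s = 1.0
    for time, surv in curve:
        if time <= t:
            s = surv
    return s

# Hypothetical follow-up in years for one cluster.
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
s4 = survival_at(curve, 4)
```

Running the estimator separately per cluster and reading each curve at 5 years gives the kind of comparison described in the caption.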
Figure 5
A-B, Peripheral Eosinophilia and Inflammation in the COPDGene Cohort. A, The percent of peripheral WBCs that are eosinophils by imaging cluster. B, C-reactive protein by imaging cluster. Global differences for each biomarker among the three clusters were assessed with the use of analysis of variance and found to be statistically significantly different for C-reactive protein (P = .003) and not significant for eosinophilia (P = .05). Pairwise differences were assessed with the use of t-tests without adjustment for multiple comparisons. One asterisk indicates P ≤ .05; two asterisks indicate P ≤ .01. ns = not significant at P > .05.
Figure 6
A-B, Emphysema Cluster-Related Gene Expression. A, With the use of linear modeling, 41 genes were identified to be differentially expressed between the preserved and emphysema cluster (false discovery rate, P < .001). Post-hoc Tukey’s honestly significant difference test was applied to examine the pairwise differences between groups. ∗P ≤ .05; ∗∗P ≤ .01.
Figure 7
Gene Set Variation Analysis of Emphysema Signature Genes in Peripheral Blood Mononuclear Cells Following Interferon-β Treatment. Gene set variation analysis was used to summarize the expression of each emphysema signature gene cluster in a published dataset of peripheral blood mononuclear cells from patients at baseline or after interferon-β treatment (GSE2610418). This demonstrates that the genes increased in the bronchial airway of patients in the emphysema-cluster group are induced significantly in peripheral blood mononuclear cells from patients with multiple sclerosis who are treated with interferon-β. Post hoc Tukey’s honestly significant difference test was applied to examine the pairwise differences between groups. Two asterisks indicate P ≤ .01. GSVA = gene set variation analysis.

Source: PubMed
