A whole blood gene expression-based signature for smoking status

Philip Beineke, Karen Fitch, Heng Tao, Michael R Elashoff, Steven Rosenberg, William E Kraus, James A Wingrove, PREDICT Investigators

Abstract

Background: Smoking is the leading cause of preventable death worldwide and has been shown to increase the risk of multiple diseases including coronary artery disease (CAD). We sought to identify genes whose levels of expression in whole blood correlate with self-reported smoking status.

Methods: Microarrays were used to identify gene expression changes in whole blood that correlated with self-reported smoking status; a set of significant genes from the microarray analysis was validated by qRT-PCR in an independent set of subjects. Stepwise forward logistic regression was performed on the qRT-PCR data to create a predictive model, whose performance was validated in an independent set of subjects and compared with that of cotinine, a nicotine metabolite.
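As an illustration of the model-building step named in the Methods, the following is a minimal sketch of forward stepwise logistic regression. It is not the authors' implementation: the abstract does not state the entry criterion, so AIC is assumed here; the simple gradient-ascent fitter stands in for a statistics package; and the gene names and data are synthetic and hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(w, x):
    # w[0] is the intercept; w[1:] pairs with the features in x.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(rows, y, n_iter=200, lr=1.0):
    """Fit by batch gradient ascent; return (weights, log-likelihood)."""
    n, p = len(rows), len(rows[0])
    w = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for x, t in zip(rows, y):
            err = t - predict(w, x)
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    ll = 0.0
    for x, t in zip(rows, y):
        pr = min(max(predict(w, x), 1e-12), 1.0 - 1e-12)
        ll += math.log(pr) if t else math.log(1.0 - pr)
    return w, ll

def forward_stepwise(X, y, names):
    """Add one feature at a time while AIC improves (assumed criterion)."""
    selected, remaining = [], list(range(len(names)))
    _, ll = fit_logistic([[] for _ in X], y)  # intercept-only model
    best_aic = 2 * 1 - 2 * ll
    while remaining:
        trials = []
        for j in remaining:
            cols = selected + [j]
            _, ll_j = fit_logistic([[row[c] for c in cols] for row in X], y)
            trials.append((2 * (len(cols) + 1) - 2 * ll_j, j))
        aic, j = min(trials)
        if aic >= best_aic:
            break  # no candidate lowers AIC, so stop
        best_aic = aic
        selected.append(j)
        remaining.remove(j)
    return [names[j] for j in selected]

def make_toy_data(n=200, seed=0):
    # Synthetic data: only the first two features influence the outcome.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(4)]
        X.append(x)
        y.append(1 if rng.random() < sigmoid(2.0 * x[0] - 1.5 * x[1]) else 0)
    return X, y

names = ["gene_a", "gene_b", "gene_c", "gene_d"]  # hypothetical labels
X, y = make_toy_data()
chosen = forward_stepwise(X, y, names)
print(chosen)
```

In practice the selection would be wrapped in cross-validation, as the reported cross-validated AUC suggests, so that the chosen genes are not evaluated on the data used to pick them.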

Results: Microarray analysis of whole blood RNA from 209 PREDICT subjects (41 current smokers, 4 quit ≤ 2 months, 64 quit > 2 months, 100 never smoked; NCT00500617) identified 4214 genes significantly correlated with self-reported smoking status. qRT-PCR was performed on 1,071 PREDICT subjects across 256 microarray genes significantly correlated with smoking or CAD. A five-gene (CLDND1, LRRN3, MUC1, GOPC, LEF1) predictive model, derived from the qRT-PCR data using stepwise forward logistic regression, had a cross-validated mean AUC of 0.93 (sensitivity=0.78; specificity=0.95) and was validated using 180 independent PREDICT subjects (AUC=0.82, CI 0.69-0.94; sensitivity=0.63; specificity=0.94). Plasma from the 180 validation subjects was used to assess levels of cotinine; a model using a threshold of 10 ng/ml cotinine resulted in an AUC of 0.89 (CI 0.81-0.97; sensitivity=0.81; specificity=0.97; kappa with expression model = 0.53).
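The performance measures reported above (sensitivity, specificity, AUC, and kappa between the two classifiers) can be sketched as follows. This is an illustrative calculation on made-up values, not study data. It assumes that cotinine at or above 10 ng/ml and an expression-model probability at or above 0.5 count as positive calls; the direction of each inequality is an assumption. For a single-threshold binary call, AUC equals the mean of sensitivity and specificity, which is consistent with the reported cotinine figures: (0.81 + 0.97) / 2 = 0.89.

```python
def sensitivity_specificity(truth, call):
    # truth and call are sequences of 0/1 (or False/True) values.
    tp = sum(1 for t, c in zip(truth, call) if t and c)
    fn = sum(1 for t, c in zip(truth, call) if t and not c)
    tn = sum(1 for t, c in zip(truth, call) if not t and not c)
    fp = sum(1 for t, c in zip(truth, call) if not t and c)
    return tp / (tp + fn), tn / (tn + fp)

def auc(truth, score):
    # Probability that a random positive outranks a random negative;
    # ties count as one half (Mann-Whitney formulation).
    pos = [s for t, s in zip(truth, score) if t]
    neg = [s for t, s in zip(truth, score) if not t]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def cohen_kappa(a, b):
    # Chance-corrected agreement between two binary raters.
    n = len(a)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    pa, pb = sum(a) / n, sum(b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)
    return (observed - expected) / (1 - expected)

# Hypothetical toy values: 1 = self-reported current smoker.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
cotinine_ng_ml = [250.0, 120.0, 40.0, 3.0, 0.5, 1.2, 15.0, 0.0, 2.0, 0.8]
ges_probability = [0.9, 0.8, 0.4, 0.7, 0.1, 0.2, 0.3, 0.05, 0.6, 0.15]

cotinine_call = [int(c >= 10.0) for c in cotinine_ng_ml]  # assumed >= 10 ng/ml
ges_call = [int(p >= 0.5) for p in ges_probability]       # assumed >= 50%

cot_sens, cot_spec = sensitivity_specificity(truth, cotinine_call)
ges_sens, ges_spec = sensitivity_specificity(truth, ges_call)
ges_auc = auc(truth, ges_probability)      # AUC of the continuous score
cot_auc = auc(truth, cotinine_call)        # AUC of the thresholded call
kappa = cohen_kappa(cotinine_call, ges_call)

print(cot_sens, cot_spec, cot_auc)
print(ges_sens, ges_spec, ges_auc)
print(kappa)
```

Scoring the expression model on its continuous probability while scoring cotinine on a thresholded call mirrors how the two AUC values read in the abstract; whether the authors computed them exactly this way is not stated.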

Conclusion: We have constructed and validated a whole blood gene expression score for the evaluation of smoking status, demonstrating that clinical and environmental factors contributing to cardiovascular disease risk can be assessed by gene expression.

Figures

Figure 1
Gene ontology analysis of 4214 array genes associated with smoking. The 4214 smoking-associated genes were analyzed using BiNGO to identify significant biological processes. Significant processes (p < 0.001 after FDR correction) are colored according to the indicated p-value gradient and annotated with the biological process. (A) Cellular component ontological terms; (B) Biological process ontological terms.
Figure 2
Hierarchical clustering of 209 subjects and 227 array genes associated with smoking (p < 0.001). The dendrogram on top shows correlations between subjects; black bars at bottom denote current smokers; red bars denote recently quit smokers. The dendrogram on the left shows correlations between genes; positions of representative cell-specific genes are shown on the right.
Figure 3
Expression levels of the four most significant genes as assessed by qRT-PCR across 1074 PREDICT subjects grouped by self-reported smoking status. Expression levels are shown in Cp units on the y-axis; self-reported smoking status is shown on the x-axis. (A) LRRN3; (B) CLDND1; (C) SASH1; (D) P2RY6.
Figure 4
Comparison of gene expression score (GES) to cotinine levels in the validation set. The y-axis shows the log10 value of cotinine levels in the 180-subject validation set; the horizontal dashed line (---) denotes the 10 ng/ml threshold used in the AUC analysis. The x-axis shows the GES in the 180-subject validation set; the vertical dashed line denotes the 50% probability threshold used in the AUC analysis. Black circles = non-smokers; red circles = former smokers (> 2 months quit); green circles = recently quit smokers (< 2 months quit); blue circles = current smokers. All smoking categories are self-reported.

Source: PubMed
