Development of a blood-based gene expression algorithm for assessment of obstructive coronary artery disease in non-diabetic patients

Michael R Elashoff, James A Wingrove, Philip Beineke, Susan E Daniels, Whittemore G Tingley, Steven Rosenberg, Szilard Voros, William E Kraus, Geoffrey S Ginsburg, Robert S Schwartz, Stephen G Ellis, Naheem Tahirkheli, Ron Waksman, John McPherson, Alexandra J Lansky, Eric J Topol

Abstract

Background: Alterations in gene expression in peripheral blood cells have been shown to be sensitive to the presence and extent of coronary artery disease (CAD). A non-invasive blood test that could reliably assess obstructive CAD likelihood would have diagnostic utility.

Results: Microarray analysis of RNA samples from a 195-patient Duke CATHGEN registry case:control cohort yielded 2,438 genes with significant CAD association (p < 0.05), and identified age, sex, and diabetic status as the clinical/demographic factors with the largest effects on gene expression. RT-PCR analysis of 88 CAD classifier genes confirmed that diabetic status was the largest clinical factor affecting CAD-associated gene expression changes. A second microarray cohort analysis, limited to non-diabetics from the multi-center PREDICT study (198 patients; 99 case:control pairs matched for age and sex), evaluated gene expression, clinical, and cell population predictors of CAD and yielded 5,935 CAD genes (p < 0.05), 655 of which were also found in the CATHGEN results. Biological pathway analyses (gene ontology and literature) and statistical analyses (hierarchical clustering and logistic regression) were used in combination to select 113 genes for RT-PCR analysis, including CAD classifiers, cell-type specific markers, and normalization genes. RT-PCR analysis of these 113 genes in a PREDICT cohort of 640 non-diabetic subject samples was used for algorithm development. Gene expression correlations identified clusters of CAD classifier genes, which were reduced to meta-genes using LASSO. The final classifier for assessment of obstructive CAD was derived by Ridge Regression and contained sex-specific age functions and 6 meta-gene terms, comprising 23 genes. This algorithm showed a cross-validated estimated AUC = 0.77 (95% CI 0.73-0.81) in ROC analysis.

Conclusions: We have developed a whole blood classifier based on gene expression, age and sex for the assessment of obstructive CAD in non-diabetic patients from a combination of microarray and RT-PCR data derived from studies of patients clinically indicated for invasive angiography.

Clinical trial registration information: PREDICT, Personalized Risk Evaluation and Diagnosis in the Coronary Tree, http://www.clinicaltrials.gov, NCT00500617.

Figures

Figure 1
Gene discovery and algorithm development patient and logic flow schematic. Initial gene discovery (CATHGEN repository) included both diabetic and non-diabetic patients. Gene discovery from PREDICT involved non-diabetic patients in a paired microarray analysis that yielded 655 significant genes in common with those from the CATHGEN arrays. For RT-PCR, 113 genes were selected and tested on 640 PREDICT patient samples, from which the final algorithm was derived and locked.
Figure 2
RT-PCR analysis of diabetic status impact on significant genes from CATHGEN microarray analysis. Significance of individual genes selected from the CATHGEN microarray cohort in non-diabetic (ND) and diabetic (D) patients is shown. The sex/age adjusted p values from a CAD logistic regression analysis in each subset are plotted (log scale). Significant p values (<0.05) are indicated in red with gene symbols, non-significant ones in black.
Figure 3
Venn diagram of microarray, RT-PCR, and algorithm gene sources. A total of 7718 genes were identified, 2438 and 5935, respectively, from the CATHGEN and PREDICT microarray analyses, with an intersection of 655 genes. For the 113 RT-PCR genes, 52 were from PREDICT, 22 from CATHGEN, and 29 from both; 10 were either normalization genes or from previous studies [13]. The final algorithm contained 20 informative genes: 10 from both microarray studies, 8 PREDICT alone, and 2 CATHGEN alone.
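The counts in this caption can be cross-checked with simple set arithmetic. The following is a minimal sketch that uses only the numbers quoted above; the variable names are illustrative and do not come from the paper.

```python
# Gene counts quoted in the Figure 3 caption.
cathgen_genes = 2438   # significant genes, CATHGEN microarray analysis
predict_genes = 5935   # significant genes, PREDICT microarray analysis
shared_genes = 655     # genes significant in both analyses

# Inclusion-exclusion: genes found in at least one microarray analysis.
union_genes = cathgen_genes + predict_genes - shared_genes

# Sources of the 113 genes carried forward to RT-PCR.
rtpcr_sources = {
    "PREDICT only": 52,
    "CATHGEN only": 22,
    "both": 29,
    "normalization or previous studies": 10,
}
rtpcr_total = sum(rtpcr_sources.values())

# Sources of the informative genes in the final algorithm.
algorithm_sources = {"both": 10, "PREDICT only": 8, "CATHGEN only": 2}
informative_total = sum(algorithm_sources.values())

print(union_genes, rtpcr_total, informative_total)
```

The three totals agree with the caption: 7718 genes overall, 113 RT-PCR genes, and 20 informative algorithm genes.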
Figure 4
Gene ontology analysis of 655 CAD genes identified from microarray studies. The 655 CAD genes identified were analyzed using the BiNGO algorithm to ascertain significant biological processes. Significant processes (p < 0.01 after FDR correction) are colored, with the gradient of p values reflected in the colors as indicated, and the biological process annotated. A total of 55 processes were significant in this analysis at p < 0.05.
Figure 5
Heat-map representation of hierarchical clustering results on 113 RT-PCR genes. Clusters were generated by hierarchical clustering, yielding 20 groups of correlated genes. Clusters were annotated as to cell type expression using BioGPS (http://www.biogps.gnf.org). Extent of correlation is indicated by color as shown in the bar.
Figure 6
Schematic of the final algorithm structure and genes. The algorithm consists of overlapping gene expression functions for men and women with a sex-specific linear age function for the former and a non-linear age function for the latter. The genes in each term and their weights are shown. For the gene expression components, 16/23 genes in 4 terms are sex-independent: Term 1 - neutrophil activation and apoptosis, Term 3 - NK cell activation to T cell ratio, Term 4 - B to T cell ratio, and Term 5 - AF289562 expression normalized to TFCP2 and HNRPF. In addition, Term 2 consists of 3 sex-independent neutrophil/innate immunity genes (S100A8, S100A12, CLEC4E) normalized to overall neutrophil gene expression (AQP9, NCF4) for women and to RPL28 (lymphocytes) for men. The final male-specific term is the normalized expression of TSPAN16. Algorithm score is calculated as described (Additional file 1).
Figure 7
ROC analysis of final algorithm. The cross-validated ROC curve for the final algorithm in the algorithm development cohort is shown. The AUC is 0.77 ± 0.04.
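For readers unfamiliar with the AUC, the following is a minimal sketch of how it can be computed from classifier scores, using the pairwise-ranking (Mann-Whitney) definition. The scores and labels are toy values invented for illustration, not study data, and the function names are hypothetical. The source does not state how the ± 0.04 was obtained; the sketch only shows that it corresponds to the 0.73-0.81 interval reported in the abstract.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    labels: 1 = case (obstructive CAD), 0 = control. Ties count as half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def interval(center, half_width):
    """Turn 'center ± half_width' into (lower, upper), rounded to 2 places."""
    return (round(center - half_width, 2), round(center + half_width, 2))


# Toy example (invented values): 3 of 4 case/control pairs are ordered correctly.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auc = roc_auc(toy_scores, toy_labels)

# The caption's 0.77 ± 0.04 expressed as an interval.
reported_interval = interval(0.77, 0.04)

print(toy_auc, reported_interval)
```

In a cross-validated setting, the scores passed to such a function would come from models fitted without the samples being scored, which is why the reported AUC is described as a cross-validated estimate.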

References

    1. Simon R. Roadmap for developing and validating therapeutically relevant genomic classifiers. J Clin Oncol. 2005;23(29):7332–7341. doi: 10.1200/JCO.2005.02.8712.
    2. Deng MC, Eisen HJ, Mehra MR, Billingham M, Marboe CC, Berry G, Kobashigawa J, Johnson FL, Starling RC, Murali S. et al. Noninvasive discrimination of rejection in cardiac allograft recipients using gene expression profiling. Am J Transplant. 2006;6(1):150–160. doi: 10.1111/j.1600-6143.2005.01175.x.
    3. Paik S, Shak S, Tang G, Kim C, Baker J, Cronin M, Baehner FL, Walker MG, Watson D, Park T. et al. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med. 2004;351(27):2817–2826. doi: 10.1056/NEJMoa041588.
    4. Subramanian J, Simon R. What should physicians look for in evaluating prognostic gene-expression signatures? Nat Rev Clin Oncol. 2010;7(6):327–334. doi: 10.1038/nrclinonc.2010.60.
    5. Aziz H, Zaas A, Ginsburg GS. Peripheral blood gene expression profiling for cardiovascular disease assessment. Genomic Medicine. 2007;1(3):105–112. doi: 10.1007/s11568-008-9017-x.
    6. Rosenberg S, Elashoff MR, Beineke P, Daniels SE, Wingrove JA, Tingley WG, Sager PT, Sehnert AJ, Yau M, Kraus WE. et al. Multicenter validation of the diagnostic accuracy of a blood-based gene expression test for assessing obstructive coronary artery disease in nondiabetic patients. Ann Intern Med. 2010;153(7):425–434.
    7. Diamond GA, Forrester JS. Analysis of probability as an aid in the clinical diagnosis of coronary-artery disease. N Engl J Med. 1979;300(24):1350–1358. doi: 10.1056/NEJM197906143002402.
    8. Chaitman BR, Bourassa MG, Davis K, Rogers WJ, Tyras DH, Berger R, Kennedy JW, Fisher L, Judkins MP, Mock MB. et al. Angiographic prevalence of high-risk coronary artery disease in patient subsets (CASS). Circulation. 1981;64(2):360–367.
    9. Ridker PM, Buring JE, Rifai N, Cook NR. Development and validation of improved algorithms for the assessment of global cardiovascular risk in women: the Reynolds Risk Score. JAMA. 2007;297(6):611–619. doi: 10.1001/jama.297.6.611.
    10. Hansson GK, Libby P, Schonbeck U, Yan ZQ. Innate and adaptive immunity in the pathogenesis of atherosclerosis. Circ Res. 2002;91(4):281–291. doi: 10.1161/01.RES.0000029784.15893.10.
    11. Libby P, Ridker PM, Maseri A. Inflammation and atherosclerosis. Circulation. 2002;105(9):1135–1143. doi: 10.1161/hc0902.104353.
    12. Sinnaeve PR, Donahue MP, Grass P, Seo D, Vonderscher J, Chibout SD, Kraus WE, Sketch M Jr, Nelson C, Ginsburg GS. et al. Gene expression patterns in peripheral blood correlate with the extent of coronary artery disease. PLoS One. 2009;4(9):e7037. doi: 10.1371/journal.pone.0007037.
    13. Wingrove JA, Daniels SE, Sehnert AJ, Tingley W, Elashoff MR, Rosenberg S, Buellesfeld L, Grube E, Newby LK, Ginsburg GS. et al. Correlation of Peripheral-Blood Gene Expression With the Extent of Coronary Artery Stenosis. Circulation: Cardiovascular Genetics. 2008;1(1):31–38. doi: 10.1161/CIRCGENETICS.108.782730.
    14. Horne BD, Anderson JL, John JM, Weaver A, Bair TL, Jensen KR, Renlund DG, Muhlestein JB. Which white blood cell subtypes predict increased cardiovascular risk? J Am Coll Cardiol. 2005;45(10):1638–1643. doi: 10.1016/j.jacc.2005.02.054.
    15. Patel MR, Peterson ED, Dai D, Brennan JM, Redberg RF, Anderson HV, Brindis RG, Douglas PS. Low diagnostic yield of elective coronary angiography. N Engl J Med. 2010;362(10):886–895. doi: 10.1056/NEJMoa0907272.
    16. Wang L, Hauser ER, Shah SH, Pericak-Vance MA, Haynes C, Crosslin D, Harris M, Nelson S, Hale AB, Granger CB. et al. Peakwide mapping on chromosome 3q13 identifies the kalirin gene as a novel candidate gene for coronary artery disease. Am J Hum Genet. 2007;80(4):650–663. doi: 10.1086/512981.
    17. Maere S, Heymans K, Kuiper M. BiNGO: a Cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks. Bioinformatics. 2005;21(16):3448–3449. doi: 10.1093/bioinformatics/bti551.
    18. Brunet JP, Tamayo P, Golub TR, Mesirov JP. Metagenes and molecular pattern discovery using matrix factorization. Proc Natl Acad Sci USA. 2004;101(12):4164–4169. doi: 10.1073/pnas.0308531101.
    19. Tibshirani R. Regression shrinkage and selection via the lasso. J Royal Statistical Society B. 1996;58:267–288.
    20. Brown PJ. Measurement, Regression, and Calibration. Oxford, UK: Oxford University Press; 1994.
    21. Ibebuogu UN, Nasir K, Gopal A, Ahmadi N, Mao SS, Young E, Honoris L, Nuguri VK, Lee RS, Usman N. et al. Comparison of atherosclerotic plaque burden and composition between diabetic and non diabetic patients by non invasive CT angiography. Int J Cardiovasc Imaging. 2009;25(7):717–723. doi: 10.1007/s10554-009-9483-9.
    22. Hamblin M, Chang L, Fan Y, Zhang J, Chen YE. PPARs and the cardiovascular system. Antioxid Redox Signal. 2009;11(6):1415–1452. doi: 10.1089/ars.2008.2280.
    23. Ellegren H, Parsch J. The evolution of sex-biased genes and sex-biased gene expression. Nat Rev Genet. 2007;8(9):689–698. doi: 10.1038/nrg2167.
    24. Hong MG, Myers AJ, Magnusson PK, Prince JA. Transcriptome-wide assessment of human brain and lymphocyte senescence. PLoS One. 2008;3(8):e3024. doi: 10.1371/journal.pone.0003024.
    25. Rana JS, Boekholdt SM, Ridker PM, Jukema JW, Luben R, Bingham SA, Day NE, Wareham NJ, Kastelein JJ, Khaw KT. Differential leucocyte count and the risk of future coronary artery disease in healthy men and women: the EPIC-Norfolk Prospective Population Study. J Intern Med. 2007;262(6):678–689. doi: 10.1111/j.1365-2796.2007.01864.x.
    26. Li C, Engstrom G, Hedblad B. Leukocyte count is associated with incidence of coronary events, but not with stroke: a prospective cohort study. Atherosclerosis. 2009;209(2):545–550. doi: 10.1016/j.atherosclerosis.2009.09.029.
    27. Su AI, Wiltshire T, Batalov S, Lapp H, Ching KA, Block D, Zhang J, Soden R, Hayakawa M, Kreiman G. et al. A gene atlas of the mouse and human protein-encoding transcriptomes. Proc Natl Acad Sci USA. 2004;101(16):6062–6067. doi: 10.1073/pnas.0400782101.
    28. Drechsler M, Megens RT, van Zandvoort M, Weber C, Soehnlein O. Hyperlipidemia-Triggered Neutrophilia Promotes Early Atherosclerosis. Circulation. 2010;2010:18.
    29. Zernecke A, Bot I, Djalali-Talab Y, Shagdarsuren E, Bidzhekov K, Meiler S, Krohn R, Schober A, Sperandio M, Soehnlein O. et al. Protective role of CXC receptor 4/CXC ligand 12 unveils the importance of neutrophils in atherosclerosis. Circ Res. 2008;102(2):209–217. doi: 10.1161/CIRCRESAHA.107.160697.
    30. Hasegawa H, Yamada Y, Harasawa H, Tsuji T, Murata K, Sugahara K, Tsuruda K, Masuda M, Takasu N, Kamihira S. Restricted expression of tumor necrosis factor-related apoptosis-inducing ligand receptor 4 in human peripheral blood lymphocytes. Cell Immunol. 2004;231(1-2):1–7. doi: 10.1016/j.cellimm.2004.11.001.
    31. Lim SY, Raftery MJ, Goyette J, Hsu K, Geczy CL. Oxidative modifications of S100 proteins: functional regulation by redox. J Leukoc Biol. 2009.
    32. Yamasaki S, Ishikawa E, Sakuma M, Hara H, Ogata K, Saito T. Mincle is an ITAM-coupled activating receptor that senses damaged cells. Nat Immunol. 2008;9(10):1179–1188. doi: 10.1038/ni.1651.
    33. Teixeira VH, Olaso R, Martin-Magniette ML, Lasbleiz S, Jacq L, Oliveira CR, Hilliquin P, Gut I, Cornelis F, Petit-Teixeira E. Transcriptome analysis describing new immunity and defense genes in peripheral blood mononuclear cells of rheumatoid arthritis patients. PLoS One. 2009;4(8):e6803. doi: 10.1371/journal.pone.0006803.
    34. Chung CP, Oeser A, Raggi P, Gebretsadik T, Shintani AK, Sokka T, Pincus T, Avalos I, Stein CM. Increased coronary-artery atherosclerosis in rheumatoid arthritis: relationship to disease duration and cardiovascular risk factors. Arthritis Rheum. 2005;52(10):3045–3053. doi: 10.1002/art.21288.
    35. Cruz-Munoz ME, Dong Z, Shi X, Zhang S, Veillette A. Influence of CRACC, a SLAM family receptor coupled to the adaptor EAT-2, on natural killer cell function. Nat Immunol. 2009;10(3):297–305. doi: 10.1038/ni.1693.
    36. Kim DK, Kabat J, Borrego F, Sanni TB, You CH, Coligan JE. Human NKG2F is expressed and can associate with DAP12. Mol Immunol. 2004;41(1):53–62. doi: 10.1016/j.molimm.2004.01.004.
    37. Whitman SC, Rateri DL, Szilvassy SJ, Yokoyama W, Daugherty A. Depletion of natural killer cell function decreases atherosclerosis in low-density lipoprotein receptor null mice. Arterioscler Thromb Vasc Biol. 2004;24(6):1049–1054. doi: 10.1161/01.ATV.0000124923.95545.2c.
    38. Major AS, Fazio S, Linton MF. B-lymphocyte deficiency increases atherosclerosis in LDL receptor-null mice. Arterioscler Thromb Vasc Biol. 2002;22(11):1892–1898. doi: 10.1161/.
    39. Robertson AK, Hansson GK. T cells in atherogenesis: for better or for worse? Arterioscler Thromb Vasc Biol. 2006;26(11):2421–2432. doi: 10.1161/01.ATV.0000245830.29764.84.
    40. Ait-Oufella H, Herbin O, Bouaziz JD, Binder CJ, Uyttenhove C, Laurans L, Taleb S, Van Vre E, Esposito B, Vilar J. et al. B cell depletion reduces the development of atherosclerosis in mice. J Exp Med. 2010;207(8):1579–1587. doi: 10.1084/jem.20100155.
    41. Park MY, Hastie T, Tibshirani R. Averaged gene expressions for regression. Biostatistics. 2007;8(2):212–227. doi: 10.1093/biostatistics/kxl002.
    42. Zou H, Hastie T. Regularization and variable selection via the elastic net. J R Statist Soc B. 2005;67:301–320. doi: 10.1111/j.1467-9868.2005.00503.x.
    43. Shen-Orr SS, Tibshirani R, Khatri P, Bodian DL, Staedtler F, Perry NM, Hastie T, Sarwal MM, Davis MM, Butte AJ. Cell type-specific gene expression differences in complex tissues. Nat Methods. 2010;7(4):287–289. doi: 10.1038/nmeth.1439.

Source: PubMed
