Competing risks data analysis with high-dimensional covariates: an application in bladder cancer

Leili Tapak, Massoud Saidijam, Majid Sadeghifar, Jalal Poorolajal, Hossein Mahjub

Abstract

Analysis of microarray data is complicated by the methodological problems of high dimensionality and small sample size. Various methods have been used for variable selection in high-dimensional, small-sample settings with a single survival endpoint. However, little effort has been directed toward competing risks, where there is more than one type of failure. This study compared three typical variable selection techniques (Lasso, elastic net, and likelihood-based boosting) for high-dimensional time-to-event data with competing risks. The performance of these methods was evaluated via a simulation study by analyzing a real dataset related to bladder cancer patients, using time-dependent receiver operating characteristic (ROC) curves and bootstrap .632+ prediction error curves. The elastic net penalization method was shown to outperform Lasso and boosting. Based on the elastic net, 33 of 1381 genes related to bladder cancer were selected. When these genes were fitted in the Fine and Gray model, eight were highly significant (P < 0.001). Among them, expression of RTN4, SON, IGF1R, SNRPE, PTGR1, PLEK, and ETFDH was associated with a decrease in survival time, whereas SMARCAD1 expression was associated with an increase in survival time. This study indicates that the elastic net has a higher capacity than Lasso and boosting for predicting survival time in bladder cancer patients. Moreover, genes selected by all methods improved the predictive power of the model based only on clinical variables, indicating the value of the information contained in the microarray features.
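
For orientation, the two ingredients named above can be written out as a minimal sketch. The abstract does not give the parameterization used, so the symbols here are assumptions: \lambda_{10}(t) is a baseline subdistribution hazard for the event of interest, \ell(\beta) is a log partial likelihood, \lambda \ge 0 is the overall penalty strength, and \alpha \in [0, 1] is a mixing weight in the convention common to penalized-regression software.

```latex
% Fine and Gray model: subdistribution hazard for the event of interest (cause 1)
\lambda_1(t \mid x) = \lambda_{10}(t)\,\exp\!\left(x^{\top}\beta\right)

% Elastic net estimate: maximize the log partial likelihood minus a mixed L1/L2 penalty
\hat{\beta} = \arg\max_{\beta}\;
  \Big\{ \ell(\beta)
  - \lambda \Big[ \alpha \lVert \beta \rVert_1
  + \tfrac{1-\alpha}{2} \lVert \beta \rVert_2^2 \Big] \Big\}
```

Under this assumed convention, \alpha = 1 reduces to the Lasso penalty and \alpha = 0 to a ridge penalty; intermediate values shrink coefficients while still setting some of them exactly to zero.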

Keywords: Cause-specific hazard; Competing risks; Elastic net; Lasso; Microarray; Subdistribution hazard.

Copyright © 2015 The Authors. Production and hosting by Elsevier Ltd. All rights reserved.

Figures

Figure 1
The area under the ROC curve (AUC) for the bladder cancer data. The y-axis shows the AUC value over time; the x-axis shows survival time, defined as time to progression or death from bladder cancer (in weeks).
Figure 2
The prediction error curves for the bladder cancer data. The clinical model used age, sex, stage, grade, and treatment as predictors. The elastic net, Lasso, and boosting models used microarray features in addition to the clinical parameters as predictors.

Source: PubMed
