A comparison of Bayesian and frequentist approaches to incorporating external information for the prediction of prostate cancer risk

Paul J Newcombe, Brian H Reck, Jielin Sun, Greg T Platek, Claudio Verzilli, A Karim Kader, Seong-Tae Kim, Fang-Chi Hsu, Zheng Zhang, S Lilly Zheng, Vincent E Mooser, Lynn D Condreay, Colin F Spraggs, John C Whittaker, Roger S Rittmaster, Jianfeng Xu

Abstract

We present the most comprehensive comparison to date of the predictive benefit of genetics in addition to currently used clinical variables, using genotype data for 33 single-nucleotide polymorphisms (SNPs) in 1,547 Caucasian men from the placebo arm of the REduction by DUtasteride of prostate Cancer Events (REDUCE®) trial. Moreover, we conducted a detailed comparison of three techniques for incorporating genetics into clinical risk prediction. The first method was a standard logistic regression model, which included separate terms for the clinical covariates and for each of the genetic markers. This approach ignores a substantial amount of external information concerning effect sizes for these genome-wide association study (GWAS)-replicated SNPs. The second and third methods investigated two possible approaches to incorporating meta-analysed external SNP effect estimates: one via a weighted prostate cancer (PCa) 'risk' score based solely on the meta-analysis estimates, and the other incorporating both the current and prior data via informative priors in a Bayesian logistic regression model. All methods demonstrated a slight improvement in predictive performance upon incorporation of genetics. The two methods that incorporated external information showed the greatest increase in receiver-operating-characteristic AUC, from 0.61 to 0.64. The value of our methods comparison is likely to lie in observations of performance similarities, rather than differences, between three approaches with very different resource requirements. The two methods that included external information performed best, but only marginally, despite substantial differences in complexity.
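As a rough illustration of the second method, a weighted risk score is commonly formed by summing each man's risk-allele counts weighted by externally estimated per-allele log odds ratios, and discrimination is then summarised by the ROC AUC. The sketch below assumes that common form; the abstract does not give the exact formula used, and the function names, odds ratios, genotypes, and labels are all hypothetical toy values, not data from the study.

```python
import math


def weighted_risk_score(genotypes, log_odds_ratios):
    """Sum of risk-allele counts (0, 1 or 2) weighted by external log odds ratios.

    Assumed form of the score; the source does not state its exact definition.
    """
    if len(genotypes) != len(log_odds_ratios):
        raise ValueError("one log odds ratio is needed per SNP")
    return sum(g * b for g, b in zip(genotypes, log_odds_ratios))


def auc(scores, labels):
    """ROC AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for d in controls:
            if c > d:
                wins += 1.0
            elif c == d:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical per-allele odds ratios for three SNPs (not from the study).
log_ors = [math.log(1.20), math.log(1.10), math.log(1.30)]

# Hypothetical risk-allele counts for four men, and their case (1) / control (0) status.
genotypes = [[0, 1, 0], [2, 1, 1], [1, 0, 0], [2, 2, 1]]
labels = [0, 1, 0, 1]

scores = [weighted_risk_score(g, log_ors) for g in genotypes]
print(auc(scores, labels))
```

In practice such a score would enter the clinical logistic regression as a single extra covariate, and the AUC would be computed on held-out folds rather than on the data used for fitting.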

© 2011 Wiley Periodicals, Inc.

Figures

Fig. 1
Predicted risk vs. observed risk for each predictive method with and without genetic information. These are calculated within deciles of predicted risk. Models were fitted and assessed under 10-fold cross-validation. Average 95% confidence intervals in the observed cross-validated risk are displayed. Clinical factors, included in every model, were family history, baseline age, PSA ratio, prostate volume and number of biopsy cores.
Fig. 2
ROC analysis of each predictive method with and without genetic information. Models were fitted and assessed under 10-fold cross-validation. Clinical factors, included in every model, were family history, baseline age, PSA ratio, prostate volume and number of biopsy cores.
Fig. 3
PPV vs. sensitivity for each predictive method with and without genetic information. Models were fitted and assessed under 10-fold cross-validation. Clinical factors, included in every model, were family history, baseline age, PSA ratio, prostate volume and number of biopsy cores.
Fig. 4
Calibration between predicted risk from the frequentist method 2 and the Bayesian method 3, both of which incorporated external information.

Source: PubMed
