Development and validation of risk prediction algorithms to estimate future risk of common cancers in men and women: prospective cohort study

Julia Hippisley-Cox, Carol Coupland

Abstract

Objective: To derive and validate a set of clinical risk prediction algorithms to estimate the 10-year risk of 11 common cancers.

Design: Prospective open cohort study using routinely collected data from 753 QResearch general practices in England. We used 565 practices to develop the scores and 188 for validation.

Subjects: 4.96 million patients aged 25-84 years in the derivation cohort; 1.64 million in the validation cohort. Patients were free of the relevant cancer at baseline.

Methods: Cox proportional hazards models in the derivation cohort to derive 10-year risk algorithms. Risk factors considered included age, ethnicity, deprivation, body mass index, smoking, alcohol, previous cancer diagnoses, family history of cancer, relevant comorbidities and medication. Measures of calibration and discrimination in the validation cohort.
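
As a rough illustration of how a fitted Cox proportional hazards model can be turned into a 10-year absolute risk, the sketch below applies the standard relationship risk = 1 - S0(10)^exp(linear predictor). The function name, baseline survival, coefficients and cohort means are all hypothetical and are not taken from the published algorithms; centring the covariates on cohort means is an assumed convention.

```python
import math

def ten_year_risk(baseline_survival, coefficients, values, means):
    """Absolute 10-year risk from a Cox model: 1 - S0(10) ** exp(linear predictor)."""
    # Linear predictor, centred on the cohort means (a common convention, assumed here).
    lp = sum(b * (x - m) for b, x, m in zip(coefficients, values, means))
    return 1.0 - baseline_survival ** math.exp(lp)

# Hypothetical inputs for illustration only; not the published coefficients.
baseline_survival = 0.99      # assumed S0(10) for a patient with mean covariates
coefficients = [0.05, 0.7]    # assumed log hazard ratios: age (per year), current smoker
means = [50.0, 0.2]           # assumed cohort means for the same two covariates
risk = ten_year_risk(baseline_survival, coefficients, [60.0, 1.0], means)
print(f"10-year risk: {risk:.1%}")
```

A patient whose covariates equal the cohort means has a linear predictor of zero, so the predicted risk reduces to 1 minus the baseline survival.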

Outcomes: Incident cases of blood, breast, bowel, gastro-oesophageal, lung, oral, ovarian, pancreas, prostate, renal tract and uterine cancers. Cancers were recorded on any one of four linked data sources (general practitioner (GP), mortality, hospital or cancer records).

Results: We identified 228,241 incident cases of the 11 types of cancer during follow-up. Of these, 25,444 were blood; 41,315 breast; 32,626 bowel; 12,808 gastro-oesophageal; 32,187 lung; 4811 oral; 6635 ovarian; 7119 pancreatic; 35,256 prostate; 23,091 renal tract; and 6949 uterine cancers. In women, the lung cancer algorithm had the best performance, with an R² of 64.2%, a D statistic of 2.74 and a receiver operating characteristic curve statistic of 0.91. The sensitivity for the top 10% of women at highest risk of lung cancer was 67%. Performance of the algorithms in men was very similar to that in women.
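
Two of the performance measures above can be sketched in a few lines. The first function uses Royston's formula linking the D statistic to explained variation in a Cox model; the second computes sensitivity among the patients in the top fraction of predicted risk. The function names and the toy data are invented for illustration and do not reproduce the authors' code or cohort.

```python
import math

def r2_from_d(d):
    """Royston's explained variation R^2_D computed from the D statistic (Cox model)."""
    kappa_sq = 8.0 / math.pi         # scaling constant kappa^2
    sigma_sq = math.pi ** 2 / 6.0    # error variance used for a Cox model
    scaled = d * d / kappa_sq
    return scaled / (sigma_sq + scaled)

def sensitivity_in_top_fraction(risks, events, fraction=0.10):
    """Share of all observed cases that fall in the top `fraction` of predicted risk."""
    ranked = sorted(zip(risks, events), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(len(ranked) * fraction))
    cases_in_top = sum(event for _, event in ranked[:n_top])
    total_cases = sum(events)
    return cases_in_top / total_cases if total_cases else float("nan")

print(f"R^2 for D = 2.74: {r2_from_d(2.74):.1%}")

# Toy data, invented for illustration: 10 patients, 2 cases, one of them ranked highest.
toy_risks = [0.30, 0.02, 0.05, 0.01, 0.12, 0.03, 0.04, 0.02, 0.06, 0.01]
toy_events = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
print(sensitivity_in_top_fraction(toy_risks, toy_events))
```

With D = 2.74 the first function returns roughly 64.2%, in line with the R² reported for lung cancer in women.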

Conclusions: We have developed and validated prediction models to quantify the absolute risk of 11 common cancers. They can be used to identify patients at high risk of cancers for prevention or further assessment, and could be integrated into clinical computer systems for that purpose.

Web calculator: A simple web calculator implementing the Qcancer 10-year risk algorithm is available, together with the open source software for download, at http://qcancer.org/10yr/.

Keywords: PRIMARY CARE; QResearch; cancer.

Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

Figures

Figure 1. Graphs of the adjusted HRs for the fractional polynomial terms for age for each cancer.
Figure 2. Graphs of the adjusted HRs for the fractional polynomial terms for body mass index for each cancer.
Figure 3. Graphs of the adjusted HRs for the fractional polynomial terms for Townsend deprivation score for each cancer.
Figure 4. Graphs of the adjusted HRs for the interactions between age and family history for each relevant cancer.
Figure 5. Graphs of the adjusted HRs for the interactions between age and smoking status for each relevant cancer.
Figure 6. Mean predicted risks and observed risks at 10 years by tenth of predicted risk, applying each algorithm to all women in the validation cohort.
Figure 7. Mean predicted risks and observed risks at 10 years by tenth of predicted risk, applying each algorithm to all men in the validation cohort.
Figure 8. The web calculator for an example patient.
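
The calibration comparison described for figures 6 and 7 (mean predicted versus observed risk by tenth of predicted risk) can be sketched as follows. This is a simplified illustration: it assumes every patient has complete 10-year follow-up, so the observed risk is a plain proportion rather than a survival-based estimate that allows for censoring. The function name and data are hypothetical.

```python
def calibration_by_tenth(predicted, outcomes, groups=10):
    """Return (mean predicted risk, observed proportion) for each tenth of predicted risk."""
    # Rank patients from lowest to highest predicted risk.
    pairs = sorted(zip(predicted, outcomes))
    n = len(pairs)
    rows = []
    for g in range(groups):
        # Integer arithmetic splits the ranked patients into near-equal groups.
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        mean_predicted = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(o for _, o in chunk) / len(chunk)
        rows.append((mean_predicted, observed))
    return rows

# Invented data: 20 patients with evenly spaced predicted risks; the 10 highest had the event.
predicted = [i / 20 for i in range(20)]
outcomes = [0] * 10 + [1] * 10
for mean_predicted, observed in calibration_by_tenth(predicted, outcomes):
    print(f"predicted {mean_predicted:.3f}  observed {observed:.3f}")
```

In a well calibrated model the two columns track each other closely across all tenths; the toy data here are deliberately poorly calibrated so that the difference is visible.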

Source: PubMed
