Data resource profile: cardiovascular disease research using linked bespoke studies and electronic health records (CALIBER)

Spiros C Denaxas, Julie George, Emily Herrett, Anoop D Shah, Dipak Kalra, Aroon D Hingorani, Mika Kivimaki, Adam D Timmis, Liam Smeeth, Harry Hemingway

Abstract

The goal of cardiovascular disease (CVD) research using linked bespoke studies and electronic health records (CALIBER) is to provide evidence to inform health care and public health policy for CVDs across different stages of translation, from discovery, through evaluation in trials, to implementation, where linkages to electronic health records provide new scientific opportunities. The initial approach of the CALIBER programme is characterized as follows: (i) Linkages of multiple electronic health record sources: examples include linkages between the longitudinal primary care data from the Clinical Practice Research Datalink, the national registry of acute coronary syndromes (Myocardial Ischaemia National Audit Project), hospitalization and procedure data from Hospital Episode Statistics, and cause-specific mortality and social deprivation data from the Office for National Statistics. Current cohort analyses involve a million people in initially healthy populations and disease registries with ∼10^5 patients. (ii) Linkages of bespoke investigator-led cohort studies (e.g. UK Biobank) to registry data (e.g. Myocardial Ischaemia National Audit Project), providing new means of ascertaining, validating and phenotyping disease. (iii) A common data model in which routine electronic health record data are made research ready, and sharable, by defining and curating with metadata >300 variables (categorical, continuous, event) on risk factors, CVDs and non-cardiovascular comorbidities. (iv) Transparency: all CALIBER studies have an analytic protocol registered in the public domain, and data are available (safe haven model) for use subject to approvals. For more information, e-mail s.denaxas@ucl.ac.uk.

Figures

Figure 1
Longitudinal nature of multiple linked data sources in CALIBER. ECG = Electrocardiography, STEMI = ST-segment elevation Myocardial Infarction, ACEI = Angiotensin-converting-enzyme Inhibitor
Figure 2
The CALIBER framework of transforming raw electronic health record data into usable research-ready data sets
Figure 3
Example of a CALIBER cohort showing initial presentation of specific cardiac endpoints (n = 32 390) with counts and sources. Appendix A illustrates the approach to defining cardiovascular diseases using multiple record sources in CALIBER
Figure 4
Example of CALIBER research projects registered in the public domain
Figure 5
Example of one CALIBER research variable, hypertension, created from multiple raw electronic health record sources. The variable uses a combination of (i) repeat continuous blood pressure measurements; (ii) categorical data on measured blood pressure (over 130 Read codes); (iii) hypertension diagnosis in primary care (over 180 Read codes); and (iv) prescription of blood pressure lowering medications
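
The Figure 5 caption describes a research variable built from four kinds of raw record. The following is a minimal sketch of how such a combination might look, not the CALIBER implementation: the thresholds (140/90 mmHg), the rule of requiring two elevated readings, the placeholder code strings, the drug names, and all identifiers such as `PatientRecord` and `hypertension_onset` are assumptions introduced here for illustration only.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Tuple

# All constants below are illustrative assumptions, not CALIBER definitions.
SBP_THRESHOLD = 140  # systolic, mmHg (assumed cut-off)
DBP_THRESHOLD = 90   # diastolic, mmHg (assumed cut-off)
# Placeholder strings standing in for the curated Read code lists.
HYPERTENSION_CODES = {"HYP_DIAG_PLACEHOLDER", "HIGH_BP_CATEGORY_PLACEHOLDER"}
# Example drug names standing in for a curated medication list.
BP_LOWERING_DRUGS = {"ramipril", "amlodipine"}


@dataclass
class PatientRecord:
    # (date, systolic, diastolic) continuous measurements
    bp_readings: List[Tuple[date, int, int]] = field(default_factory=list)
    # (date, code) categorical measurement codes and diagnosis codes
    coded_entries: List[Tuple[date, str]] = field(default_factory=list)
    # (date, drug name) prescriptions
    prescriptions: List[Tuple[date, str]] = field(default_factory=list)


def hypertension_onset(rec: PatientRecord) -> Optional[date]:
    """Return the earliest date any source indicates hypertension, or None."""
    candidates = []

    # (i) Repeat continuous measurements: assume two elevated readings are
    # needed, and date the evidence at the second one.
    elevated = sorted(
        d for d, sbp, dbp in rec.bp_readings
        if sbp >= SBP_THRESHOLD or dbp >= DBP_THRESHOLD
    )
    if len(elevated) >= 2:
        candidates.append(elevated[1])

    # (ii) and (iii) Categorical blood pressure codes and diagnosis codes.
    coded = [d for d, code in rec.coded_entries if code in HYPERTENSION_CODES]
    if coded:
        candidates.append(min(coded))

    # (iv) Prescription of a blood pressure lowering medication.
    treated = [d for d, drug in rec.prescriptions if drug in BP_LOWERING_DRUGS]
    if treated:
        candidates.append(min(treated))

    return min(candidates) if candidates else None


example = PatientRecord(
    bp_readings=[
        (date(2005, 1, 10), 150, 95),
        (date(2005, 3, 1), 145, 85),
        (date(2005, 6, 1), 120, 80),
    ],
    coded_entries=[(date(2005, 4, 2), "HYP_DIAG_PLACEHOLDER")],
)
onset = hypertension_onset(example)
```

Taking the earliest date across sources is one possible design choice; a real definition could instead prioritise sources or require agreement between them.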

Source: PubMed
