Machine learning for prediction of all-cause mortality in patients with suspected coronary artery disease: a 5-year multicentre prospective registry analysis

Manish Motwani, Damini Dey, Daniel S Berman, Guido Germano, Stephan Achenbach, Mouaz H Al-Mallah, Daniele Andreini, Matthew J Budoff, Filippo Cademartiri, Tracy Q Callister, Hyuk-Jae Chang, Kavitha Chinnaiyan, Benjamin J W Chow, Ricardo C Cury, Augustin Delago, Millie Gomez, Heidi Gransar, Martin Hadamitzky, Joerg Hausleiter, Niree Hindoyan, Gudrun Feuchtner, Philipp A Kaufmann, Yong-Jin Kim, Jonathon Leipsic, Fay Y Lin, Erica Maffei, Hugo Marques, Gianluca Pontone, Gilbert Raff, Ronen Rubinshtein, Leslee J Shaw, Julia Stehli, Todd C Villines, Allison Dunning, James K Min, Piotr J Slomka

Abstract

Aims: Traditional prognostic risk assessment in patients undergoing non-invasive imaging is based upon a limited selection of clinical and imaging findings. Machine learning (ML) can consider a greater number and complexity of variables. Therefore, we investigated the feasibility and accuracy of ML to predict 5-year all-cause mortality (ACM) in patients undergoing coronary computed tomographic angiography (CCTA), and compared the performance to existing clinical or CCTA metrics.

Methods and results: The analysis included 10 030 patients with suspected coronary artery disease and 5-year follow-up from the COronary CT Angiography EvaluatioN For Clinical Outcomes: An InteRnational Multicenter registry. All patients underwent CCTA as their standard of care. Twenty-five clinical and 44 CCTA parameters were evaluated, including segment stenosis score (SSS), segment involvement score (SIS), modified Duke index (DI), number of segments with non-calcified, mixed or calcified plaques, age, sex, standard cardiovascular risk factors, and Framingham risk score (FRS). Machine learning involved automated feature selection by information gain ranking, model building with a boosted ensemble algorithm, and 10-fold stratified cross-validation. Seven hundred and forty-five patients died during 5-year follow-up. Machine learning exhibited a higher area under the curve than the FRS or CCTA severity scores alone (SSS, SIS, DI) for predicting all-cause mortality (ML: 0.79 vs. FRS: 0.61, SSS: 0.64, SIS: 0.64, DI: 0.62; P < 0.001).

Conclusions: Machine learning combining clinical and CCTA data was found to predict 5-year ACM significantly better than existing clinical or CCTA metrics alone.

Keywords: Coronary CT angiography; Coronary artery disease; Machine learning; Prognosis.

Published on behalf of the European Society of Cardiology. All rights reserved. © The Author 2016. For permissions please email: journals.permissions@oup.com.

Figures

Figure 1
Feature selection. Forty-four coronary computed tomographic angiography variables (blue) and 25 clinical variables (green) were available. Information gain ranking was used to evaluate the worth of each variable by measuring the entropy gain with respect to the outcome, and then rank the attributes by their individual evaluations (top to bottom). Only attributes resulting in information gain >0 (above red line) were subsequently used in boosting. This figure shows the results from one representative fold of the cross-validation procedure. Variables are followed by units, or categorical range in parentheses; full details are in the Supplementary material online, Appendix. ACM, all-cause mortality; BMI, body mass index; CAD, coronary artery disease; CCS, coronary calcium score; CVA, cerebrovascular accident; D, diagonal; DM, diabetes mellitus; EF, ejection fraction; F, female; FRS, Framingham risk score; HDL, high-density lipoprotein; HTN, hypertension; LM, left main; LAD, left anterior descending artery; LCX, left circumflex; LDL, low-density lipoprotein; M, male; NoV, number of vessels; Nr., number of; OM, obtuse marginal; PAD, peripheral arterial disease; PL, posterolateral branch; RCA, right coronary artery; sev, severe; SSS, segment stenosis score; SIS, segment involvement score.
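The ranking step described in this caption can be illustrated with a minimal sketch. It is not the study's code: it assumes every variable is categorical (continuous variables such as age would first need to be discretised), and the function names `entropy`, `information_gain`, and `rank_features` are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(feature_values, labels):
    """Outcome entropy minus the expected outcome entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(features, labels, min_gain=0.0):
    """Rank features by information gain, keeping only those with gain > min_gain.

    `features` maps a feature name to its list of values (one per patient).
    """
    gains = {name: information_gain(values, labels) for name, values in features.items()}
    ranked = sorted(gains.items(), key=lambda item: item[1], reverse=True)
    return [(name, gain) for name, gain in ranked if gain > min_gain]
```

With the default `min_gain=0.0`, the filter corresponds to the "information gain >0" cut-off (the red line) in the figure. In practice a small positive tolerance may be preferable, because floating-point rounding can turn a true zero into a tiny non-zero value.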
Figure 2
Computational methods. Machine learning involved automated feature selection by information gain ranking, model building with a boosted ensemble algorithm (LogitBoost), and 10-fold stratified cross-validation.
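As an illustration of the stratified cross-validation step, the sketch below deals patients into folds so that each fold keeps roughly the overall event rate. It is an assumption-laden toy, not the registry pipeline: `stratified_folds` and `train_test_splits` are hypothetical names, and the LogitBoost model itself is not implemented here.

```python
import random


def stratified_folds(labels, n_folds=10, seed=0):
    """Assign sample indices to folds so that every fold has a similar class mix."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


def train_test_splits(labels, n_folds=10, seed=0):
    """Yield (train_indices, test_indices), holding out each fold exactly once."""
    folds = stratified_folds(labels, n_folds, seed)
    for held_out in range(n_folds):
        test = sorted(folds[held_out])
        train = sorted(i for k, fold in enumerate(folds) if k != held_out for i in fold)
        yield train, test
```

Because Figure 1 shows feature selection results for "one representative fold", feature ranking was presumably repeated on the training portion of each split rather than once on the full dataset. Repeating it per fold keeps held-out patients from influencing which variables are selected.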
Figure 3
Receiver-operating characteristic curves for prediction of 5-year all-cause mortality. Machine learning using feature selection and a LogitBoost model had a significantly higher area under the curve for all-cause mortality prediction than all other scores (P < 0.001)†. Areas under the curve for segment stenosis score and segment involvement score were greater than that for Framingham risk score (P < 0.05)‡. FRS, Framingham risk score; SSS, segment stenosis score; SIS, segment involvement score; DI, modified Duke index.
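The area under the curve compared in this figure is the probability that a randomly chosen patient who died receives a higher score than a randomly chosen survivor. A minimal sketch of that pairwise definition follows; `roc_auc` is a hypothetical helper, outcomes are assumed to be coded 1 (event) and 0 (no event), and the significance test used to compare curves is not shown because the source does not name it.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of event/non-event pairs ranked correctly (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

For example, `roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])` returns 0.75, because three of the four event/non-event pairs are ordered correctly. The double loop is quadratic in the number of patients, so a rank-based formulation would be preferable for a cohort of this size.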
Figure 4
Calibration plot for LogitBoost model. The calibration plot shows the relationship between the observed and predicted proportion of events, grouped by decile of risk. The LogitBoost model showed good calibration with the observed 5-year risk of all-cause mortality.
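The grouping behind a calibration plot of this kind can be sketched as follows: patients are sorted by predicted risk, cut into ten equal-sized groups, and the mean predicted risk in each group is compared with the observed event rate. The helper name `calibration_by_decile` is hypothetical, and equal-sized groups are an assumption, since the source does not state how decile boundaries were set.

```python
def calibration_by_decile(predicted, observed, n_groups=10):
    """Return (mean predicted risk, observed event rate, group size) per risk group.

    `predicted` holds risks in [0, 1]; `observed` holds outcomes coded 1 or 0.
    """
    pairs = sorted(zip(predicted, observed), key=lambda pair: pair[0])
    total = len(pairs)
    rows = []
    for g in range(n_groups):
        start = g * total // n_groups
        stop = (g + 1) * total // n_groups
        group = pairs[start:stop]
        if not group:
            continue  # Fewer samples than groups: skip empty slices.
        mean_predicted = sum(p for p, _ in group) / len(group)
        event_rate = sum(y for _, y in group) / len(group)
        rows.append((mean_predicted, event_rate, len(group)))
    return rows
```

A well-calibrated model yields rows whose first two entries are close to each other, which corresponds to points lying near the diagonal of the plot.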

Source: PubMed
