Prediction model development of late-onset preeclampsia using machine learning-based methods

Jong Hyun Jhee, SungHee Lee, Yejin Park, Sang Eun Lee, Young Ah Kim, Shin-Wook Kang, Ja-Young Kwon, Jung Tak Park

Abstract

Preeclampsia is one of the leading causes of maternal and fetal morbidity and mortality. Because effective preventive measures are lacking, its prediction is essential for prompt management. This study aimed to develop machine learning models that predict late-onset preeclampsia from hospital electronic medical record data. The performance of the machine learning-based models was also compared with that of models built using conventional statistical methods. A total of 11,006 pregnant women who received antenatal care at Yonsei University Hospital were included. Maternal data recorded from the early second trimester to 34 weeks of gestation were retrieved from electronic medical records. The prediction outcome was the occurrence of late-onset preeclampsia after 34 weeks' gestation. Pattern recognition and cluster analysis were used to select the parameters included in the prediction models. Logistic regression, a decision tree model, naïve Bayes classification, a support vector machine, a random forest algorithm, and a stochastic gradient boosting method were used to construct the prediction models. The C-statistic was used to assess the performance of each model. The overall preeclampsia development rate was 4.7% (474 patients). Systolic blood pressure, serum blood urea nitrogen and creatinine levels, platelet counts, serum potassium level, white blood cell count, serum calcium level, and urinary protein were the most influential variables included in the prediction models. C-statistics for the decision tree model, naïve Bayes classification, support vector machine, random forest algorithm, stochastic gradient boosting method, and logistic regression models were 0.857, 0.776, 0.573, 0.894, 0.924, and 0.806, respectively. The stochastic gradient boosting model had the best prediction performance, with an accuracy of 0.973 and a false positive rate of 0.009.
Combining maternal factors with common antenatal laboratory data from the early second trimester through the early third trimester could effectively predict late-onset preeclampsia using machine learning algorithms. Future prospective studies are needed to verify the clinical applicability of these algorithms.
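
The three performance measures reported in the abstract (C-statistic, accuracy, and false positive rate) can be illustrated with a minimal sketch. It is not the authors' code: the function names `c_statistic` and `accuracy_and_fpr`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration. The C-statistic is computed here as the proportion of case/non-case pairs in which the case receives the higher predicted score, with ties counted as half.

```python
from itertools import product


def c_statistic(labels, scores):
    """Probability that a random case scores higher than a random non-case.

    Labels are 1 (case) or 0 (non-case). Tied scores count as half a pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for p, n in product(pos, neg):
        if p > n:
            concordant += 1.0
        elif p == n:
            concordant += 0.5
    return concordant / (len(pos) * len(neg))


def accuracy_and_fpr(labels, scores, threshold=0.5):
    """Accuracy and false positive rate after thresholding the scores.

    The default threshold of 0.5 is an assumption, not a value from the study.
    """
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    return accuracy, fpr


# Invented toy data: 1 = late-onset preeclampsia, 0 = no preeclampsia.
toy_labels = [0, 0, 0, 0, 0, 0, 1, 1]
toy_scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60, 0.55, 0.90]

toy_c = c_statistic(toy_labels, toy_scores)
toy_accuracy, toy_fpr = accuracy_and_fpr(toy_labels, toy_scores)
print(f"C-statistic={toy_c:.3f} accuracy={toy_accuracy:.3f} FPR={toy_fpr:.3f}")
```

The C-statistic does not depend on any decision threshold, whereas accuracy and false positive rate do, which is one reason the three measures are reported separately.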

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Flow chart of pattern recognition and cluster analysis based variable selection process for late-onset preeclampsia prediction.
Fig 2. Normalized importance of the selected variables for late-onset preeclampsia prediction models.
The plot shows the relative importance of the variables in the random forest model. IncNodePurity reflects the reduction in entropy (that is, uncertainty) achieved by splitting on the attribute. Abbreviations: SBP, systolic blood pressure; WBC, white blood cell; UPCR, urine protein to creatinine ratio; UACT, urine albumin to creatinine ratio.
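
As a rough illustration of the impurity reduction described in this caption, the sketch below computes the entropy decrease for a single split on a single variable. It is a simplified assumption-laden example, not the authors' procedure: a random forest accumulates such decreases over many splits and trees, and the impurity measure a given implementation uses may differ (for example, the Gini index or the residual sum of squares). The function names, thresholds, and toy values are hypothetical.

```python
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    result = 0.0
    for cls in set(labels):
        p = labels.count(cls) / n
        result -= p * log2(p)
    return result


def impurity_decrease(values, labels, threshold):
    """Parent entropy minus the size-weighted entropy of the two child nodes
    produced by splitting on values <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted_children = (
        (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    )
    return entropy(labels) - weighted_children


# Invented toy data: systolic blood pressure (mmHg) and outcome (1 = preeclampsia).
toy_sbp = [105, 110, 118, 122, 135, 141, 150, 158]
toy_outcome = [0, 0, 0, 0, 1, 0, 1, 1]

gain_at_130 = impurity_decrease(toy_sbp, toy_outcome, threshold=130)
gain_at_115 = impurity_decrease(toy_sbp, toy_outcome, threshold=115)
print(f"entropy decrease at 130 mmHg: {gain_at_130:.3f}")
print(f"entropy decrease at 115 mmHg: {gain_at_115:.3f}")
```

In this toy data the split at 130 mmHg separates the classes better than the split at 115 mmHg, so it yields the larger entropy decrease; a variable that repeatedly produces large decreases would rank higher in an importance plot of this kind.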
Fig 3. Receiver operating characteristic curves of late-onset preeclampsia prediction models.
C-statistics for each prediction model are presented in the graph. Abbreviations: DT, decision tree; NBC, naïve Bayes classification; SVM, support vector machine; RF, random forest; SGB, stochastic gradient boosting; LR, logistic regression.
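
A receiver operating characteristic curve of the kind described here can be traced by sweeping a decision threshold over the predicted scores, and the area under it equals the C-statistic for a binary outcome. The sketch below is illustrative only: `roc_points`, `area_under_curve`, and the toy data are hypothetical and are not taken from the study.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per distinct
    score threshold, starting at (0, 0) and ending at (1, 1)."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one non-case")
    points = [(0.0, 0.0)]
    # Lower the threshold step by step so that more samples are called positive.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Invented toy data: 1 = late-onset preeclampsia, 0 = no preeclampsia.
toy_roc_labels = [0, 0, 1, 0, 1, 1]
toy_roc_scores = [0.1, 0.3, 0.35, 0.5, 0.7, 0.9]

toy_curve = roc_points(toy_roc_labels, toy_roc_scores)
toy_auc = area_under_curve(toy_curve)
print(f"area under the ROC curve: {toy_auc:.3f}")
```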


Source: PubMed
