Machine learning models predicting multidrug resistant urinary tract infections using "DsaaS"

Alessio Mancini, Leonardo Vito, Elisa Marcelli, Marco Piangerelli, Renato De Leone, Sandra Pucciarelli, Emanuela Merelli

Abstract

Background: The aim of this work is to build a Machine Learning model able to predict a patient's risk of contracting a multidrug resistant urinary tract infection (MDR UTI) after hospitalization. To achieve this goal, we used several popular Machine Learning tools. We also integrated an easy-to-use cloud platform called DSaaS (Data Science as a Service), which is well suited to hospital settings, where healthcare operators may lack specific competence in programming languages but still need to analyze data as a continuous process. In addition, DSaaS allows the validation of data analysis models based on supervised Machine Learning regression and classification algorithms.

Results: We used DSaaS on a real antibiotic stewardship dataset to make predictions about antibiotic resistance in the Clinical Pathology Operative Unit of the Principe di Piemonte Hospital in Senigallia, Marche, Italy. The data relate to a total of 1486 hospitalized patients with nosocomial urinary tract infection (UTI). Sex, age, age class, ward and time period were used to predict the onset of an MDR UTI. Machine Learning methods such as Catboost, Support Vector Machine and Neural Networks were used to build predictive models. Among the performance evaluators already implemented in DSaaS, we used accuracy (ACC), area under the receiver operating characteristic curve (AUC-ROC), area under the Precision-Recall curve (AUC-PRC), F1 score, sensitivity (SEN), specificity and Matthews correlation coefficient (MCC). Catboost exhibited the best predictive results (MCC 0.909; SEN 0.904; F1 score 0.809; AUC-PRC 0.853; AUC-ROC 0.739; ACC 0.717), with the highest value in every metric.
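
As an illustration of how the threshold-based evaluators listed above relate to one another, the following minimal Python sketch derives ACC, SEN, specificity, F1 score and MCC from the four cells of a binary confusion matrix. It is not the DSaaS implementation: the function name `binary_metrics`, the helper `_safe_div` and the example counts are assumptions made here for illustration only, and the counts are not taken from the study.

```python
import math


def _safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives and false negatives (here, positive = MDR UTI).
    """
    sensitivity = _safe_div(tp, tp + fn)      # SEN, also called recall
    specificity = _safe_div(tn, tn + fp)      # true negative rate
    precision = _safe_div(tp, tp + fp)        # positive predictive value
    accuracy = _safe_div(tp + tn, tp + fp + tn + fn)
    f1 = _safe_div(2 * precision * sensitivity, precision + sensitivity)
    # MCC uses all four cells, so it stays informative on imbalanced data.
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = _safe_div(tp * tn - fp * fn, mcc_denominator)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts chosen for illustration; they are NOT results from the paper.
metrics = binary_metrics(tp=80, fp=20, tn=70, fn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

AUC-ROC and AUC-PRC are not covered by this sketch, because they are computed from the ranked prediction scores across all decision thresholds rather than from a single confusion matrix.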

Conclusions: The predictive model built with DSaaS may serve as a useful support tool for physicians treating hospitalized patients at high risk of acquiring MDR UTIs. We obtained these results using only five predictors that are quick and easy to collect for each patient hospitalization. In the future, DSaaS will be enriched with more features, such as unsupervised Machine Learning techniques, streaming data analysis, distributed computation, and big data storage and management, to allow researchers to perform a complete data analysis pipeline. The DSaaS prototype is available as a demo at the following address: https://dsaas-demo.shinyapps.io/Server/.

Keywords: Antibiotic stewardship; Classification; Data science pipeline; Machine learning; Multi drug resistance; Nosocomial infection; Regression.

Conflict of interest statement

All authors have read and approved the final manuscript, and none of them have financial or competing interests.

Figures

Fig. 1
DSaaS future architecture. The modules described in this paper, which are already operative, are shown in dark gray. The modules that will be implemented in the future to provide data flow editing, R scripting and a Stewardship UI are shown in light gray.

Source: PubMed
