Adding Continuous Vital Sign Information to Static Clinical Data Improves the Prediction of Length of Stay After Intubation: A Data-Driven Machine Learning Approach

David Castiñeira, Katherine R Schlosser, Alon Geva, Amir R Rahmani, Gaston Fiore, Brian K Walsh, Craig D Smallwood, John H Arnold, Mauricio Santillana

Abstract

Background: Bedside monitors in the ICU routinely measure and collect patients' physiologic data in real time to continuously assess the health status of patients who are critically ill. With the advent of increased computational power and the ability to store and rapidly process big data sets in recent years, these physiologic data show promise in identifying specific outcomes and/or events during patients' ICU hospitalization.

Methods: We introduced a methodology designed to automatically extract information from continuous-in-time vital sign data collected from bedside monitors to predict whether a patient in a pediatric ICU will experience a prolonged length of stay on mechanical ventilation, defined as >4 d.
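As a rough illustration of the idea described in the Methods, the following Python sketch collapses each vital-sign time series into a few summary statistics and labels a stay as prolonged when it exceeds 4 d. It is not the authors' feature engineering, which the abstract does not detail: the signal names (`hr`, `spo2`), the choice of statistics, the helper names, and the sample values are all assumptions made for illustration. Only the >4 d threshold comes from the text.

```python
from statistics import mean, pstdev

PROLONGED_LOS_DAYS = 4  # from the text: a prolonged stay is defined as > 4 d


def summarize_vital(samples):
    """Collapse one vital-sign time series into summary features (assumed set)."""
    if not samples:
        raise ValueError("empty vital-sign series")
    return {
        "mean": mean(samples),
        "std": pstdev(samples),
        "min": min(samples),
        "max": max(samples),
    }


def build_features(vitals):
    """Flatten per-signal summaries into one feature dict, e.g. 'hr_mean'."""
    features = {}
    for name, samples in vitals.items():
        for stat, value in summarize_vital(samples).items():
            features[f"{name}_{stat}"] = value
    return features


def is_prolonged(los_days):
    """Binary target: True when length of stay exceeds the 4 d threshold."""
    return los_days > PROLONGED_LOS_DAYS


# Hypothetical readings for one subject (illustrative values, not study data)
example = {"hr": [120, 130, 125, 135], "spo2": [97, 95, 96, 98]}
features = build_features(example)
```

Each subject then becomes one fixed-length feature vector, which is the form a tree-based classifier expects regardless of how long or irregular the raw recordings are.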

Results: Continuous-in-time vital sign information and clinical history data were retrospectively collected for 284 ICU subjects from their first 24 h on mechanical ventilation in a medical-surgical pediatric ICU at Boston Children's Hospital. Multiple machine learning models were trained on multiple subsets of these subjects to predict the likelihood that each of these subjects would experience a long stay. We evaluated the predictive power of our models strictly on unseen hold-out validation sets of subjects. Our methodology achieved an area under the curve of >83% by using only vital sign information as input, and an area under the curve of 90% by combining vital sign information with subjects' static clinical data readily available in electronic health records. We repeated this approach across 300 independently trained experiments with different choices of training and hold-out validation sets to ensure the consistency and robustness of our results in our study sample. The predictive power of our approach outperformed recent efforts that used deep learning for a similar prediction task.
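The evaluation protocol described in the Results (many independent train/hold-out splits, each scored by area under the curve and then summarized by percentiles) can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the authors' pipeline: the 20% hold-out fraction, the stratified splitting, the nearest-rank percentile, and the toy one-feature scorer `fit_toy` (a stand-in that is not gradient boosting) are all hypothetical choices, and the data are synthetic.

```python
import math
import random


def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs both classes in the hold-out set")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def percentile(values, q):
    """Nearest-rank percentile (assumed convention) for q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def stratified_split(y, holdout_fraction, rng):
    """Split indices per class; assumes at least two subjects in each class."""
    train, test = [], []
    for cls in (0, 1):
        members = [i for i, label in enumerate(y) if label == cls]
        rng.shuffle(members)
        cut = max(1, int(len(members) * holdout_fraction))
        test += members[:cut]
        train += members[cut:]
    return train, test


def fit_toy(X_train, y_train):
    """Stand-in model (NOT gradient boosting): orients a single feature."""
    pos = [x for x, label in zip(X_train, y_train) if label == 1]
    neg = [x for x, label in zip(X_train, y_train) if label == 0]
    sign = 1.0 if sum(pos) / len(pos) >= sum(neg) / len(neg) else -1.0
    return lambda x: sign * x


def repeated_holdout(X, y, fit, n_experiments=300, holdout_fraction=0.2, seed=0):
    """Train on each split, score only the unseen hold-out subjects, collect AUCs."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_experiments):
        train, test = stratified_split(y, holdout_fraction, rng)
        score = fit([X[i] for i in train], [y[i] for i in train])
        aucs.append(auc([y[i] for i in test], [score(X[i]) for i in test]))
    return aucs


# Synthetic, perfectly separable data (illustrative only, not study data)
X = [float(v) for v in range(20)]
y = [0] * 10 + [1] * 10
aucs = repeated_holdout(X, y, fit_toy)
summary = {p: percentile(aucs, p) for p in (10, 50, 90)}
```

Reporting the 10th, 50th, and 90th percentiles of the per-split AUCs, rather than a single number, shows how much the result depends on which subjects happen to land in the hold-out set.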

Conclusions: Our proposed workflow may prove useful in the design of scalable approaches for real-time predictive systems in ICU environments, exploiting real-time vital sign information from bedside monitors. (ClinicalTrials.gov registration NCT02184208.)

Keywords: big data in medicine; biomedical and health data science; clinical decision making; critical care; data driven machine learning; decision support systems; intensive care; length of stay; length of stay estimation; machine learning; mechanical ventilation; pediatrics; precision medicine; prediction; predictive analytics.

Conflict of interest statement

The authors have disclosed no conflicts of interest.

Copyright © 2020 by Daedalus Enterprises.

Figures

Fig. 1.
Proposed methodology for feature engineering. GBT = gradient boosting tree; LOS = length of stay.
Fig. 2.
A: Distribution of length of stay (LOS) across all subjects in this study; the inset in this figure shows a more detailed view of this distribution, with a focus on LOS <20 d (LOS >20 d grouped into one single bin). B: Distribution of pre-ICU admission location where 1 = in-patient surgical floors, 2 = in-patient medical floors, 3 = emergency room, 4 = other ICUs, 5 = operating room (OR)/procedures, 6 = other/unknown. C: Vital signs.
Fig. 3.
A: All scenarios with total accuracy. Receiver operating characteristic curves for the 3 data types. B: Static clinical data. C: Time series data. D: Static clinical data plus time series data. The 10th, 50th, and 90th percentile curves are shown (P10, P50, P90).
Fig. 4.
Accuracies obtained with gradient boosting tree predictive models. The 10th and 90th percentiles are provided (P10, P90). LOS = length of stay.

Source: PubMed
