What's in a Note? Unpacking Predictive Value in Clinical Note Representations

Willie Boag, Dustin Doss, Tristan Naumann, Peter Szolovits

Abstract

Electronic Health Records (EHRs) have seen a rapid increase in adoption during the last decade. The narrative prose contained in clinical notes is unstructured and unlocking its full potential has proved challenging. Many studies incorporating clinical notes have applied simple information extraction models to build representations that enhance a downstream clinical prediction task, such as mortality or readmission. Improved predictive performance suggests a "good" representation. However, these extrinsic evaluations are blind to most of the insight contained in the notes. In order to better understand the power of expressive clinical prose, we investigate both intrinsic and extrinsic methods for understanding several common note representations. To ensure replicability and to support the clinical modeling community, we run all experiments on publicly-available data and provide our code.

Figures

Figure 1. An example clinical note. The age, gender, and admitting diagnosis have been highlighted. Note also that descriptions such as “status worsening” suggest deterioration and possible in-hospital mortality.
Figure 2. A patient’s time in the ICU generates a sequence of timestamped notes. Each of the methods described transforms the sequence of notes into a fixed-length vector representing the ICU stay.
Figure 3. How the embedding for a single document is built by combining constituent word embeddings.
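The caption describes combining word embeddings into one document vector. A minimal sketch of one common aggregation choice, averaging, is below; the paper’s exact combination scheme may differ, and the toy vocabulary and dimensions here are purely illustrative.

```python
import numpy as np

def embed_document(tokens, word_vectors, dim):
    """Build a document embedding by averaging its tokens' word vectors.

    Averaging is one common aggregation choice (an assumption here, not
    necessarily the paper's exact method). `word_vectors` maps token -> array.
    """
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    if not vecs:  # no in-vocabulary tokens: fall back to the zero vector
        return np.zeros(dim)
    return np.mean(vecs, axis=0)

# Toy 3-dimensional word vectors (illustrative only)
wv = {"status": np.array([1.0, 0.0, 0.0]),
      "worsening": np.array([0.0, 1.0, 0.0])}

# Out-of-vocabulary tokens ("unk") are simply skipped
doc_vec = embed_document(["status", "worsening", "unk"], wv, dim=3)
```

The resulting fixed-length vector can then be fed to any downstream classifier, regardless of how many words the note contained.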
Figure 4. The many-to-one prediction task for the LSTM, in which a document representation is fed in at each timestep, and a prediction (e.g., diagnosis) is made at the end of the sequence.
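The many-to-one setup of Figure 4 can be sketched as a plain NumPy LSTM forward pass: one note representation per timestep, a single sigmoid prediction from the final hidden state. This is a self-contained illustration with randomly initialised weights and no bias terms, not the authors' actual (Keras-based) implementation; all names, sizes, and the mortality-style output are assumptions for the sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_many_to_one(doc_seq, params):
    """Run an LSTM over a sequence of document vectors and emit a single
    prediction from the final hidden state (many-to-one).

    Bias terms are omitted for brevity. Shapes: W* is (hidden, input),
    U* is (hidden, hidden), w_out is (hidden,).
    """
    Wf, Wi, Wo, Wc = params["Wf"], params["Wi"], params["Wo"], params["Wc"]
    Uf, Ui, Uo, Uc = params["Uf"], params["Ui"], params["Uo"], params["Uc"]
    h = np.zeros(Uf.shape[0])
    c = np.zeros(Uf.shape[0])
    for x in doc_seq:                   # one timestep per note representation
        f = sigmoid(Wf @ x + Uf @ h)    # forget gate
        i = sigmoid(Wi @ x + Ui @ h)    # input gate
        o = sigmoid(Wo @ x + Uo @ h)    # output gate
        c = f * c + i * np.tanh(Wc @ x + Uc @ h)
        h = o * np.tanh(c)
    # Single prediction at the end of the sequence, e.g. P(in-hospital death)
    return sigmoid(w_out @ h) if (w_out := params["w_out"]) is not None else h

rng = np.random.default_rng(0)
d, hdim = 4, 8  # illustrative input / hidden sizes
params = {k: rng.normal(scale=0.1, size=(hdim, d)) for k in ("Wf", "Wi", "Wo", "Wc")}
params.update({k: rng.normal(scale=0.1, size=(hdim, hdim)) for k in ("Uf", "Ui", "Uo", "Uc")})
params["w_out"] = rng.normal(scale=0.1, size=hdim)

p = lstm_many_to_one(rng.normal(size=(5, d)), params)  # 5 notes -> one score
```

In practice the weights would be learned end-to-end; the point of the sketch is only the data flow: a variable-length sequence of note vectors in, one probability out.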
Figure 5. PCA 2-D projection of the word embeddings. Vectors of the special age tokens are colored red. Note that these tokens cluster close together in the embedding.
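A 2-D PCA projection like the one in Figure 5 can be computed directly from an embedding matrix via the SVD of the centered data; the sketch below assumes a random embedding matrix for illustration (the paper's embeddings come from the clinical notes themselves).

```python
import numpy as np

def pca_2d(embeddings):
    """Project embeddings onto their top two principal components via SVD.

    Equivalent in result to a standard PCA with two components, which is
    a common way to produce 2-D scatter plots of word vectors.
    """
    X = embeddings - embeddings.mean(axis=0)      # center each dimension
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:2].T                           # (n_words, 2) coordinates

rng = np.random.default_rng(1)
emb = rng.normal(size=(100, 50))  # illustrative: 100 words, 50-d embeddings
coords = pca_2d(emb)              # ready for a 2-D scatter plot
```

Each row of `coords` is one word's position in the plot; nearby rows correspond to words whose embeddings are close, which is how the clustering of the age tokens becomes visible.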


Source: PubMed
