Integrating host response and unbiased microbe detection for lower respiratory tract infection diagnosis in critically ill adults

Charles Langelier, Katrina L Kalantar, Farzad Moazed, Michael R Wilson, Emily D Crawford, Thomas Deiss, Annika Belzer, Samaneh Bolourchi, Saharai Caldera, Monica Fung, Alejandra Jauregui, Katherine Malcolm, Amy Lyden, Lillian Khan, Kathryn Vessel, Jenai Quan, Matt Zinter, Charles Y Chiu, Eric D Chow, Jenny Wilson, Steve Miller, Michael A Matthay, Katherine S Pollard, Stephanie Christenson, Carolyn S Calfee, Joseph L DeRisi

Abstract

Lower respiratory tract infections (LRTIs) lead to more deaths each year than any other infectious disease category. Despite this, etiologic LRTI pathogens are infrequently identified due to limitations of existing microbiologic tests. In critically ill patients, noninfectious inflammatory syndromes resembling LRTIs further complicate diagnosis. To address the need for improved LRTI diagnostics, we performed metagenomic next-generation sequencing (mNGS) on tracheal aspirates from 92 adults with acute respiratory failure and simultaneously assessed pathogens, the airway microbiome, and the host transcriptome. To differentiate pathogens from respiratory commensals, we developed a rules-based model (RBM) and logistic regression model (LRM) in a derivation cohort of 20 patients with LRTIs or noninfectious acute respiratory illnesses. When tested in an independent validation cohort of 24 patients, both models achieved accuracies of 95.5%. We next developed pathogen, microbiome diversity, and host gene expression metrics to identify LRTI-positive patients and differentiate them from critically ill controls with noninfectious acute respiratory illnesses. When tested in the validation cohort, the pathogen metric performed with an area under the receiver operating characteristic curve (AUC) of 0.96 (95% CI, 0.86-1.00), the diversity metric with an AUC of 0.80 (95% CI, 0.63-0.98), and the host transcriptional classifier with an AUC of 0.88 (95% CI, 0.75-1.00). Combining these metrics achieved a negative predictive value of 100%. This study suggests that a single streamlined protocol offering an integrated genomic portrait of pathogen, microbiome, and host transcriptome may hold promise as a tool for LRTI diagnosis.

Keywords: lower respiratory tract infection; mechanical ventilation; next-generation sequencing; pneumonia; transcriptome.

Conflict of interest statement

The authors declare no conflict of interest.

Copyright © 2018 the Author(s). Published by PNAS.

Figures

Fig. 1.
Study overview and analysis workflow. Patients with acute respiratory failure were enrolled within 72 h of ICU admission, and TA samples were collected and underwent both RNA sequencing (RNA-seq) and shotgun DNA sequencing (DNA-seq). Post hoc clinical adjudication blinded to mNGS results identified patients with LRTI defined by clinical and microbiologic criteria (LRTI+C+M); LRTI defined by clinical criteria only (LRTI+C); patients with noninfectious reasons for acute respiratory failure (no-LRTI); and respiratory failure due to unknown cause (unk-LRTI). The LRTI+C+M and no-LRTI groups were divided into derivation and validation cohorts. To detect pathogens and differentiate them from a background of commensal microbiota, we developed two models: a rules-based model (RBM) and a logistic regression model (LRM). LRTI probability was next evaluated with (i) a pathogen metric, (ii) a lung microbiome diversity metric, and (iii) a 12-gene host transcriptional classifier. Models were then combined and optimized for LRTI rule out.
Fig. 2.
Workflow for distinguishing LRTI pathogens from commensal respiratory microbiota using an algorithmic approach. (A) Projection of microbial relative abundance in log reads per million reads sequenced (rpm) by RNA sequencing (RNA-seq) (x axis) versus DNA sequencing (DNA-seq) (y axis) for representative cases. In the LRTI+C+M group, pathogens identified by standard clinical microbiology (filled shapes) had higher overall relative abundance compared with other taxa detected by sequencing (open shapes). The largest score differential between ranked microbes (max Δrpm) was used as a threshold to identify high-scoring taxa, distinct from the other microbes based on abundance (line with arrows). Red indicates taxa represented in the reference list of established LRTI pathogens. (B) Receiver operating characteristic (ROC) curve demonstrating logistic regression model (LRM) performance for detecting pathogens versus commensal microbiota in both the derivation and validation cohorts. The gray ROC curve and shaded region indicate results from 1,000 rounds of training and testing on randomized sets of the derivation cohort. The blue and green lines indicate predictions using leave-one-patient-out cross-validation (LOPO-CV) on the derivation cohort and validation on the validation cohort, respectively. (C) Microbes predicted by the LRM to represent putative pathogens. The x axis represents combined RNA-seq and DNA-seq relative abundance, and the y axis indicates pathogen probability. The dashed line reflects the optimized probability threshold for pathogen assignment. Red filled circles: microbes predicted by LRM to represent putative LRTI pathogens that were also identified by conventional microbiologic tests. Blue filled circles: microbes predicted to represent putative LRTI pathogens by LRM only. Blue open circles: microbes identified by NGS but not predicted by the LRM to represent putative pathogens. Red open circles: microbes identified using NGS and by standard microbiologic testing but not predicted to be putative pathogens. Dark red outlined circles: microbes detected as part of a polymicrobial culture.
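The max Δrpm thresholding described in panel A can be illustrated with a minimal sketch. It is not the authors' implementation: the score definition (sum of log10 RNA-seq and DNA-seq rpm with a pseudocount of 1), the function name `high_scoring_taxa`, the handling of single-taxon samples, and the example abundances are all assumptions made for illustration.

```python
import math


def high_scoring_taxa(rpm_by_taxon, pathogen_reference):
    """Return taxa above the largest score gap that are also in the reference list.

    rpm_by_taxon: dict mapping taxon name -> (rna_rpm, dna_rpm).
    pathogen_reference: set of taxa treated as established LRTI pathogens.
    """
    # Assumed score: log10 RNA-seq rpm + log10 DNA-seq rpm (pseudocount of 1).
    scores = {
        taxon: math.log10(rna + 1) + math.log10(dna + 1)
        for taxon, (rna, dna) in rpm_by_taxon.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    if len(ranked) < 2:
        # No gap can be computed; fall back to the reference-list filter alone.
        return [t for t in ranked if t in pathogen_reference]

    # Differences between consecutive ranked scores; the largest one is the cut.
    gaps = [scores[ranked[i]] - scores[ranked[i + 1]] for i in range(len(ranked) - 1)]
    cut = gaps.index(max(gaps))
    top = ranked[: cut + 1]
    return [t for t in top if t in pathogen_reference]


# Invented abundances for illustration only (not study data).
example_sample = {
    "Streptococcus pneumoniae": (5000, 800),
    "Prevotella melaninogenica": (40, 10),
    "Rothia mucilaginosa": (30, 8),
    "Veillonella parvula": (20, 5),
}
example_reference = {"Streptococcus pneumoniae", "Staphylococcus aureus"}
example_calls = high_scoring_taxa(example_sample, example_reference)
```

In this toy sample the largest gap falls between the first and second ranked taxa, so only the dominant organism is retained and then checked against the reference list.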
Fig. 3.
Distribution of respiratory pathogens identified in patients using clinician-ordered diagnostics versus mNGS. Number of subjects in whom each respiratory microbe was detected. All microbes detected by clinician-ordered diagnostics were detected by mNGS; however, pink bars indicate microbes misclassified as negative by either the RBM or LRM. Notably, all microbes identified by clinician-ordered diagnostics and misclassified by either the RBM or LRM (pink bars) were found in polymicrobial cultures, highlighting the presence of dominant pathogens by NGS that are not captured in the polymicrobial culture results. Red bars indicate microbes detected by clinician-ordered diagnostics and also predicted as pathogens by either the RBM or LRM. More detail on which model identified each microbe can be found in SI Appendix, Fig. S2. Dark red bars (LRTI+C+M and LRTI+C subjects) and gray bars (no-LRTI subjects) indicate number of cases with microbes detected only by mNGS.
Fig. 4.
Diversity of the transcriptionally active lung microbiome in patients with LRTI (LRTI+C+M) versus noninfectious respiratory illnesses (no-LRTI). (A) Box plots of Shannon diversity index (SDI) of the lung microbiome, assessed by RNA-seq at the genus level in the derivation cohort, differed between LRTI+C+M and no-LRTI groups. (B) The β diversity assessed by PERMANOVA on Bray–Curtis dissimilarity values in the derivation cohort differed between LRTI+C+M and no-LRTI groups. (C) ROC curve demonstrating performance of SDI to distinguish LRTI+C+M from no-LRTI groups.
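For readers unfamiliar with the SDI, the standard formula is H = −Σ p_i ln p_i, where p_i is the proportion of reads assigned to genus i. The sketch below computes it from genus-level counts; the natural-log base, the function name `shannon_diversity`, and the example counts are assumptions, and the study itself may have used a package such as vegan with different settings.

```python
import math


def shannon_diversity(genus_counts):
    """Shannon diversity index H = -sum(p_i * ln(p_i)) over nonzero counts."""
    total = sum(genus_counts)
    if total <= 0:
        return 0.0
    return -sum(
        (count / total) * math.log(count / total)
        for count in genus_counts
        if count > 0
    )


# Invented genus-level read counts for illustration only (not study data).
even_community = [25, 25, 25, 25]
dominated_community = [97, 1, 1, 1]

sdi_even = shannon_diversity(even_community)
sdi_dominated = shannon_diversity(dominated_community)
```

A community dominated by a single genus yields a lower SDI than an evenly distributed one, which is the property that lets the index serve as a per-patient score for an ROC analysis.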
Fig. 5.
Host transcriptional profiling distinguishes patients with acute LRTI (LRTI+C+M) from those with noninfectious acute respiratory illness (no-LRTI). (A) Host classifier scores for all patients in the derivation and validation cohorts; each bar indicates a patient score and is colored as follows: LRTI+C+M, red; no-LRTI, blue. Orange dotted line indicates the host classifier threshold (score, −4) that achieved 100% sensitivity in the training set and was used to classify the test set samples. (B) Normalized expression levels, arranged by unsupervised hierarchical clustering, reflect overexpression (blue) or underexpression (turquoise) of classifier genes (rows) for each patient (columns). Twelve genes were identified as predictive in the derivation cohort and subsequently applied to predict LRTI status in the validation cohort. Column colors above the heatmap indicate whether a patient belonged to the derivation cohort (dark gray) or validation cohort (light gray) and whether they were adjudicated to have LRTI+C+M (red) or no-LRTI (blue). (C) ROC curves demonstrating host classifier performance for derivation (blue) and validation (green) cohorts.
Fig. 6.
Combined LRTI prediction metric integrating pathogen detection and host gene expression. (A) Scores per patient for each of the two components of this LRTI rule-out model are projected into a scatterplot (x axis represents the host metric; y axis represents the microbe score). The thresholds optimized for sensitivity in the derivation cohort are indicated by gray dashed lines. Each point represents one patient; those in the derivation cohort have no fill, and those in the validation cohort are filled. Red indicates LRTI+C+M, and blue indicates no-LRTI subjects. (B) LRTI rule-out model results for each patient are shown for both the derivation and validation cohorts, with study subjects shown in rows and metrics in columns. Dark gray indicates a metric exceeded the optimized LRTI threshold; light gray indicates it did not. Dark red indicates the subject was positive for both the pathogen and host metrics, and thus was classified as having LRTI. White indicates missing data.
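The combination rule in panel B, and the negative predictive value it is judged by, can be sketched as follows. This is a hedged illustration rather than the study's code: the host threshold of −4 is taken from the Fig. 5 caption, while the microbe threshold of 0.36, the function names, and the patient scores are hypothetical values invented for the example.

```python
def classify_lrti(host_score, microbe_score, host_threshold, microbe_threshold):
    """Call LRTI only when both metrics exceed their thresholds."""
    return host_score > host_threshold and microbe_score > microbe_threshold


def negative_predictive_value(predictions, truths):
    """NPV = true negatives / (true negatives + false negatives)."""
    true_neg = sum(1 for p, t in zip(predictions, truths) if not p and not t)
    false_neg = sum(1 for p, t in zip(predictions, truths) if not p and t)
    if true_neg + false_neg == 0:
        return None  # no negative calls, so NPV is undefined
    return true_neg / (true_neg + false_neg)


HOST_THRESHOLD = -4.0      # from the Fig. 5 caption
MICROBE_THRESHOLD = 0.36   # hypothetical value for illustration

# Invented (host score, microbe score, adjudicated LRTI) tuples; not study data.
toy_patients = [
    (-1.0, 0.90, True),
    (-3.5, 0.60, True),
    (-6.0, 0.70, False),
    (-2.0, 0.10, False),
    (-7.0, 0.05, False),
]

toy_predictions = [
    classify_lrti(host, microbe, HOST_THRESHOLD, MICROBE_THRESHOLD)
    for host, microbe, _ in toy_patients
]
toy_truths = [truth for _, _, truth in toy_patients]
toy_npv = negative_predictive_value(toy_predictions, toy_truths)
```

Because each threshold is tuned for sensitivity, a negative call from this rule is intended to support ruling out LRTI, which is why NPV is the natural summary statistic.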

Source: PubMed
