Critical Relevance of Stochastic Effects on Low-Bacterial-Biomass 16S rRNA Gene Analysis

John R Erb-Downward, Nicole R Falkowski, Jennifer C D'Souza, Lisa M McCloskey, Roderick A McDonald, Christopher A Brown, Kerby Shedden, Robert P Dickson, Christine M Freeman, Kathleen A Stringer, Betsy Foxman, Gary B Huffnagle, Jeffrey L Curtis, Sara D Adar

Abstract

The bacterial microbiome of human body sites previously considered sterile remains highly controversial because it can be challenging to isolate signal from noise when low-biomass samples are being analyzed. We tested the hypothesis that stochastic sequencing noise, separable from reagent contamination, is generated during sequencing on the Illumina MiSeq platform when DNA input is below a critical threshold. We first purified DNA from serial dilutions of Pseudomonas aeruginosa and from negative controls using three DNA purification kits, quantified input using droplet digital PCR, and then sequenced the 16S rRNA gene in four technical replicates. This process identified a reproducible contaminant signal that was separable from irreproducible stochastic noise, which emerged as the bacterial biomass of samples decreased. This approach was then applied to authentic respiratory samples from healthy individuals (n = 22) that ranged from high to ultralow bacterial biomass. Using oral rinse, bronchoalveolar lavage (BAL) fluid, and exhaled breath condensate (EBC) samples and matched controls, we were able to demonstrate (i) that stochastic noise dominates sequencing in real-world low-bacterial-biomass samples that contain fewer than 10^4 copies of the 16S rRNA gene per sample, (ii) that critical examination of the community composition of technical replicates can be used to separate signal from noise, and (iii) that EBC is an irreproducible modality for sampling the microbiome of the lower airways. We anticipate that these results, combined with suggested methods for identifying and dealing with noisy communities, will facilitate increased reproducibility while simultaneously permitting characterization of potentially important low-biomass communities.

Importance

DNA contamination from external sources (reagents, environment, operator, etc.) has long been assumed to be the main cause of spurious signals that appear under low-bacterial-biomass conditions. Here, we demonstrate that contamination can be separated from another, random signal generated during low-biomass-sample sequencing. This stochastic noise is not reproduced between technical replicates; however, results for any one replicate taken alone could look like a microbial community different from the controls. Using this information, we investigated respiratory samples from healthy humans and determined the narrow range of bacterial biomass where samples transition from producing reproducible microbial sequences to ones dominated by noise. We present a rigorous approach to studies involving low-bacterial-biomass samples to detect this source of noise and provide a framework for deciding if a sample is likely to be dominated by noise. We anticipate that this work will facilitate increased reproducibility in the characterization of potentially important low-biomass communities.
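The replicate-based check described above can be illustrated with a minimal sketch. It is not the authors' pipeline (the reference list points to R packages such as vegan); it only shows how the mean pairwise Bray-Curtis distance between technical replicates might be computed on relative abundances and combined with the copy-number threshold stated in the abstract (10^4 copies of the 16S rRNA gene per sample). The function names and the distance cutoff of 0.5 are hypothetical and chosen purely for illustration.

```python
from itertools import combinations


def to_relative(counts):
    """Convert raw OTU read counts to relative abundances."""
    total = sum(counts)
    if total == 0:
        raise ValueError("replicate has no reads")
    return [c / total for c in counts]


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two equal-length abundance vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must list the same OTUs in the same order")
    total = sum(u) + sum(v)
    if total == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def mean_replicate_distance(replicates):
    """Mean pairwise Bray-Curtis distance among technical replicates."""
    relative = [to_relative(r) for r in replicates]
    pairs = list(combinations(relative, 2))
    if not pairs:
        raise ValueError("need at least two technical replicates")
    return sum(bray_curtis(a, b) for a, b in pairs) / len(pairs)


def assess_sample(copies_16s, replicates,
                  copy_threshold=1e4,       # threshold stated in the abstract
                  distance_threshold=0.5):  # hypothetical cutoff, illustration only
    """Report two indicators that a sample may be dominated by stochastic noise."""
    distance = mean_replicate_distance(replicates)
    return {
        "mean_bray_curtis": distance,
        "below_copy_threshold": copies_16s < copy_threshold,
        "replicates_discordant": distance > distance_threshold,
    }


if __name__ == "__main__":
    # Replicates that agree in composition despite different read depths.
    concordant = [[90, 10, 0], [180, 20, 0], [45, 5, 0]]
    # Replicates that share no OTUs, as expected when noise dominates.
    discordant = [[10, 0, 0], [0, 10, 0], [0, 0, 10]]
    print(assess_sample(1e6, concordant))
    print(assess_sample(5e3, discordant))
```

A sample that falls below the copy threshold and also shows discordant replicates would be a candidate for the noise-dominated category; any real cutoff on the distance would need to be calibrated against controls rather than taken from this sketch.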

Keywords: 16S rRNA gene; contamination; exhaled breath condensate; low biomass; lung microbiome; next-generation sequencing; sequencing noise.

Copyright © 2020 Erb-Downward et al.

Figures

FIG 1
Serial dilutions of P. aeruginosa DNA purified using three separate DNA isolation kits and sequenced in quadruplicate: effects of low biomass on results. (A) An idealized model of contamination effects on a single sample purified using 3 separate DNA isolation kits and 3 technical replicates for each of those kits. Contamination within each kit is assumed to be 100% different from that of reagents in all other kits. The distribution of a hypothetical similarity score is plotted along the x axis for similarity between technical replicates (in red) and between kits (in blue). Where overlap occurs, the curves appear purple. The first column depicts the condition of high concentrations of DNA, the second column depicts low concentrations of DNA where kit or reagent contamination is dominant, and the third column depicts low concentrations of DNA where random noise dominates. (B) Heat map of the top 100 OTUs (horizontal axis) broken down by dilution (vertical axis) and grouped using complete linkage clustering. Note the emergence of increasing numbers and diversity of low-abundance reads with increasing dilution. (C) Kernel density estimates for the intrareplicate (within-kit) and interreplicate (between-kit) Bray-Curtis distance for each dilution series. (D) Heat map showing the individual results from reagent controls with technical replicates (n = 3/sample). Samples are grouped using complete linkage clustering. The DNA isolation kit is indicated by the color displayed to the left of the heat map. (E) Graph depicting the interreplicate Bray-Curtis distance from technical replicates of reagent controls; bars are colored by kit as in panel D.
FIG 2
Relationship between the number of 16S rRNA gene copies in a sample and the reproducibility of the result. (A) Intrareplicate Bray-Curtis distance by sample type. Relationship between the number of bacterial 16S rRNA gene copies in a sample and the intrareplicate Bray-Curtis distance between replicates of respiratory specimens and their individual controls. (B) Mean (± SEM) number of 16S rRNA gene copies per sample by sample type. (C) Concentration of the number of bacterial 16S rRNA gene copies in a sample plotted against the intrareplicate Bray-Curtis distance between replicates of respiratory specimens and their individual controls. Each value is the mean ± SEM.
FIG 3
Comparison of the representation of the lung microbiome in EBC versus CBAL. (A) Principal component analysis (PCA) graph depicting CBAL samples (red) and scope prewash controls (blue). (B) PCA graph depicting EBC (red) and EBC controls (blue). (C) 3D scatterplot of CBAL sample OTU abundances, where each replicate is plotted on a separate axis. Common signals between each replicate should appear along the diagonal of the 3D box. Drop lines anchor the points to a position in the x-y plane, whereas color reflects higher abundances along the z axis. (D) 3D scatterplot of EBC sample OTU abundances, where each replicate is plotted on a separate axis. Drop lines anchor the points to a position in the x-y plane, whereas color reflects higher abundances along the z axis. (E) Rank abundance plots of the means of replicate medians of EBC (top) compared to CBAL fluid (bottom). Plots are ordered according to the mean abundances of the CBAL samples. Insets are the sample controls (EBC control and scope prewash control, respectively) ordered by mean abundances of CBAL samples. Bars show means of replicate medians ± SEM and are colored by the phylum of the OTU.

Source: PubMed
