Analyses of the microbial diversity across the human microbiome

Kelvin Li, Monika Bihan, Shibu Yooseph, Barbara A Methé

Abstract

Analysis of human body microbial diversity is fundamental to understanding community structure, biology and ecology. The National Institutes of Health Human Microbiome Project (HMP) has provided an unprecedented opportunity to examine microbial diversity within and across body habitats and individuals through pyrosequencing-based profiling of 16S rRNA gene sequences (16S) from habitats of the oral, skin, distal gut, and vaginal body regions of over 200 healthy individuals, a sample size that enables the application of statistical techniques. In this study, two approaches were applied to elucidate the nature and extent of human microbiome diversity. First, bootstrap and parametric curve fitting techniques were evaluated to estimate the maximum number of unique taxa, S(max), and the taxa discovery rate for habitats across individuals. Next, our results demonstrated that the variation of diversity within low abundant taxa across habitats and individuals was not sufficiently quantified by standard ecological diversity indices. This impact of low abundant taxa motivated us to introduce a novel rank-based diversity measure, the Tail statistic (τ), defined as the standard deviation of the rank abundance curve after it is made symmetric by reflection around the most abundant taxon. Because τ is more sensitive to low abundant taxa, its application to diversity estimation of taxonomic units, using both taxonomy-dependent and taxonomy-independent methods, revealed a greater range of values between individuals than between body habitats, and different patterns of diversity within habitats. The greatest range of τ values within and across individuals was found in stool, which also exhibited the most undiscovered taxa. Oral and skin habitats revealed variable diversity patterns, while vaginal habitats were consistently the least diverse. Collectively, these results demonstrate the importance, and motivate the introduction, of several visualization and analysis methods tuned specifically for next-generation sequence data, further revealing that low abundant taxa serve as an important reservoir of genetic diversity in the human microbiome.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Contribution of taxa to diversity.
A theoretical rank abundance curve (a PDF) is overlaid with its CDF (black) as a “Pareto chart”. The overlaid colored lines represent each diversity index as lower abundant taxonomic units are included. For example, at “c”, the height of each curve represents the relative value of the index if the sample were composed only of a, b, and c. The more quickly an index curve reaches its maximum normalized value of 1.0, the less capable the index is of resolving low abundance taxonomic units. The graph shows that the Shannon and Simpson diversity indices approach their saturation point more quickly than the Tail statistic or a Renyi entropy with a fractional alpha.
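The saturation behaviour described in this caption can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' plotting code. It makes three assumptions: the top-k taxa are renormalised to sum to 1 before each index is computed, Simpson is taken in its Gini-Simpson form (1 minus the sum of squared proportions), and τ is computed as the square root of the sum of p_i (i − 1)² over ranks i. The abundance profile `profile` is hypothetical.

```python
import math

def shannon(p):
    """Shannon entropy (natural log) of a proportion vector."""
    return -sum(x * math.log(x) for x in p if x > 0)

def simpson(p):
    """Gini-Simpson index, 1 - sum(p_i^2) (assumed form)."""
    return 1.0 - sum(x * x for x in p)

def tail(p):
    """Tail statistic, assumed as sqrt(sum_i p_i * (i - 1)^2) over ranks i = 1..N."""
    ranked = sorted(p, reverse=True)
    # enumerate starts at 0, which equals (rank - 1)
    return math.sqrt(sum(x * k * k for k, x in enumerate(ranked)))

def saturation_curve(p, index):
    """Index on the renormalised top-k taxa, divided by its full-sample value."""
    ranked = sorted(p, reverse=True)
    full = index(ranked)
    curve = []
    for k in range(1, len(ranked) + 1):
        top = ranked[:k]
        total = sum(top)
        curve.append(index([x / total for x in top]) / full)
    return curve

# Hypothetical rank abundance profile (proportions sum to 1).
profile = [0.5, 0.25, 0.125, 0.0625, 0.0625]
for name, fn in [("Shannon", shannon), ("Simpson", simpson), ("Tail", tail)]:
    print(name, [round(v, 3) for v in saturation_curve(profile, fn)])
```

An index whose curve climbs toward 1.0 after only a few taxa has little left to gain from the remaining rare taxa, which is the sense in which it "saturates".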
Figure 2. Body habitats ordered by diversity measure.
Body regions are color coded: oral (black), skin (red), vaginal (green), and stool (blue). Subfigures a, b, and c were computed on genera-based taxonomic units. Subfigures d, e, and f were computed on OTU-based taxonomic units.
Figure 3. Comparison of diversity indices for median versus pooled taxonomic profiles.
Simple regression lines were drawn in solid black for each median individual versus pooled samples scatterplot. The dashed blue lines (slope = 1, y-intercept = 0) represent where a hypothetical (median = pooled) relationship would exist if all individuals had identical taxonomic profiles. Both the OTU-based and genera-based comparisons using the Shannon diversity index indicate only a slight and almost constant elevation of the diversity between the median individual and pooled samples. However, τ is able to capture the lengthening tail attributed to the low abundant taxa that are exclusive to certain individuals. See Table S2 for a mapping of abbreviations to habitat names. Green, red, black and blue points represent vaginal, skin, oral, and stool body regions, respectively.
Figure 4. Low abundance, high ubiquity taxa.
This figure illustrates the relationship between abundance and ubiquity when defining a core microbiome. As one would expect, increasing the abundance threshold for deciding whether a sample contains a particular taxon reduces the percentage of samples (ubiquity) that contain it. The lines presented refer to all taxa in the stool samples that are in more than 97.5% of the samples at an abundance cutoff of 0.05%. The taxon Bacteroides (red) is both relatively highly abundant and highly ubiquitous, so its fall-off is less steep than that of the Clostridiales shown.
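The abundance-versus-ubiquity relationship in this caption can be illustrated with a short sketch. It assumes each sample is a mapping from taxon name to relative abundance and that a taxon counts as present when its abundance is at or above the cutoff (whether the original analysis used a strict or non-strict threshold is not stated). The function names and the sample data are hypothetical.

```python
def ubiquity(samples, taxon, cutoff):
    """Fraction of samples in which `taxon` has relative abundance >= cutoff."""
    present = sum(1 for s in samples if s.get(taxon, 0.0) >= cutoff)
    return present / len(samples)

def ubiquity_curve(samples, taxon, cutoffs):
    """Ubiquity of `taxon` evaluated at each abundance cutoff."""
    return [ubiquity(samples, taxon, c) for c in cutoffs]

# Hypothetical relative-abundance profiles for four samples.
samples = [
    {"Bacteroides": 0.30, "rare_taxon": 0.0010},
    {"Bacteroides": 0.20, "rare_taxon": 0.0006},
    {"Bacteroides": 0.25},
    {"Bacteroides": 0.40, "rare_taxon": 0.0020},
]
cutoffs = [0.0005, 0.001, 0.01, 0.1]
print("Bacteroides:", ubiquity_curve(samples, "Bacteroides", cutoffs))
print("rare_taxon: ", ubiquity_curve(samples, "rare_taxon", cutoffs))
```

A highly abundant taxon keeps its ubiquity as the cutoff rises, while a low abundance taxon drops out quickly, which is the difference in fall-off the figure describes.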
Figure 5. OTU-to-Genera ratios.
The median ratio of OTUs to genera was calculated for each body habitat and plotted from greatest to least. These medians and 95% confidence intervals were estimated with bootstrapping by resampling from the combined distribution of OTUs and genera to a common read depth. The common read depth chosen was that of the body habitat with the least read coverage, the left antecubital fossa.
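One way such a bootstrap could be set up is sketched below. This is not the authors' procedure, only a plausible reading of it: each read is assumed to carry both an OTU label and a genus label, reads are resampled with replacement to a fixed depth, and the ratio of distinct OTUs to distinct genera is recorded per replicate. The percentile-based interval, the function name, and the read data are all assumptions made for illustration.

```python
import random
import statistics

def otu_genus_ratio_bootstrap(reads, depth, n_boot=200, seed=0):
    """Median and percentile interval of (#distinct OTUs / #distinct genera).

    reads: list of (otu_label, genus_label) pairs, one per read (assumed layout).
    depth: number of reads drawn with replacement in each bootstrap replicate.
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_boot):
        draw = rng.choices(reads, k=depth)
        otus = {otu for otu, _ in draw}
        genera = {genus for _, genus in draw}
        ratios.append(len(otus) / len(genera))
    ratios.sort()
    # Simple percentile interval from the sorted bootstrap replicates.
    lo = ratios[int(0.025 * (n_boot - 1))]
    hi = ratios[int(0.975 * (n_boot - 1))]
    return statistics.median(ratios), lo, hi

# Hypothetical reads: four OTUs falling into two genera.
reads = (
    [("otu1", "Bacteroides")] * 50
    + [("otu2", "Bacteroides")] * 30
    + [("otu3", "Prevotella")] * 15
    + [("otu4", "Prevotella")] * 5
)
median, lower, upper = otu_genus_ratio_bootstrap(reads, depth=40)
print(median, lower, upper)
```

Drawing every habitat to the same depth keeps the ratios comparable, since both OTU and genus counts grow with the number of reads sampled.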
Figure 6. Comparison of all and “common” taxonomic units and their effect on the Shannon and τ statistics.
For both genera-based and OTU-based taxonomic units, the Shannon diversity index and τ were compared against the median estimated Smax on all (blue) and common (green) taxonomic units. Each point in the scatterplot represents one of the 18 body habitats. For both genera-based and OTU-based profiles, τ is more closely related to Smax than the Shannon diversity index is. The red line represents a simple regression line across all points.
Figure 7. Dominance Profiles.
These stacked bar plots help to compare the low abundant taxonomic units, which may be difficult to visualize with rank abundance curves alone. The number of taxonomic units for each body region is represented by the height of each bar plot. The proportions that are colored represent the relative logarithm of abundance with the color key on the left. The subpanels, a and b, represent genera and OTUs, respectively.
Figure 8. Relationship between the Tail statistic, τ, and Standard Deviation, σ.
τ is the standard deviation of the rank abundance curve after reflection around the most dominant taxonomic unit, i = 1. The blue bars represent the rank abundance curve. Above each bar, the probability, Pr[i], of the ith most dominant taxonomic unit has been labelled in italics. The natural numbers labelled in bold above the blue bars represent the rank, i, of each taxonomic unit. The name of each taxonomic unit is labelled along the x-axis. The grey bars represent the mirror image of the rank abundance curve. Treating i = 1 of the symmetric distribution as μ = 1, the standard deviation, σ, is then 3.764, which also represents τ for this rank abundance curve and sample.
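The relationship between τ and σ described in this caption can be checked numerically with a minimal sketch. It assumes τ is computed as the square root of the sum of Pr[i] (i − 1)² over ranks i, and that the reflection splits each taxon's probability equally between rank i and its mirror position 2 − i, so that the symmetric distribution still sums to 1 with mean 1. Both function names and the counts are hypothetical; the data behind the 3.764 value in the figure are not reproduced here.

```python
import math

def tail_statistic(abundances):
    """tau, assumed as sqrt(sum_i Pr[i] * (i - 1)^2) with taxa ranked by abundance."""
    total = sum(abundances)
    ranked = sorted((a / total for a in abundances), reverse=True)
    return math.sqrt(sum(p * (i - 1) ** 2 for i, p in enumerate(ranked, start=1)))

def reflected_sd(abundances):
    """Standard deviation of the rank abundance curve mirrored around rank 1.

    Assumption: each taxon's probability is split equally between rank i
    and the mirrored position 2 - i.
    """
    total = sum(abundances)
    ranked = sorted((a / total for a in abundances), reverse=True)
    points = []
    for i, p in enumerate(ranked, start=1):
        points.append((i, p / 2))
        points.append((2 - i, p / 2))
    mean = sum(x * w for x, w in points)
    variance = sum(w * (x - mean) ** 2 for x, w in points)
    return math.sqrt(variance)

# Hypothetical read counts per taxonomic unit.
counts = [120, 60, 30, 10, 5, 1, 1]
print(tail_statistic(counts), reflected_sd(counts))
```

Under these assumptions the two functions agree, and a sample dominated by a single taxon gives τ = 0, while a long tail of rare taxa at high ranks increases τ.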

Source: PubMed
