Use of direct gradient analysis to uncover biological hypotheses in 16S survey data and beyond

John R Erb-Downward, Amir A Sadighi Akha, Juan Wang, Ning Shen, Bei He, Fernando J Martinez, Margaret R Gyetko, Jeffrey L Curtis, Gary B Huffnagle

Abstract

This study investigated the use of direct gradient analysis of bacterial 16S pyrosequencing surveys to identify relevant bacterial community signals against a "noisy" background, and to facilitate hypothesis testing both within and beyond the realm of ecological surveys. The results, drawn from three different real-world data sets, demonstrate the utility of adding direct gradient analysis to any analysis that draws conclusions from indirect methods such as Principal Component Analysis (PCA) and Principal Coordinates Analysis (PCoA). Direct gradient analysis produces testable models and can identify significant patterns in noisy data. Additionally, we demonstrate that direct gradient analysis can be used with other kinds of multivariate data sets, such as flow cytometric data, to identify differentially expressed populations. The results of this study demonstrate the utility of direct gradient analysis in microbial ecology and in other areas of research where large multivariate data sets are involved.
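The notion of a testable constrained model can be illustrated with a minimal sketch. It assumes a single categorical constraint (for example, subject or treatment); in that case the fitted values of a redundancy analysis reduce to group means, so the constrained share of variance is the between-group sum of squares divided by the total. The function names, the toy abundance table, and the permutation count are hypothetical and are not taken from the study.

```python
import random


def constrained_fraction(table, groups):
    """Share of total variance explained by one categorical constraint.

    table  : list of samples, each a list of abundances (one per taxon)
    groups : list of group labels, one per sample
    Assumption: with a single factor as the constraint, the fitted values
    are the group means, so the share is between-group SS / total SS.
    """
    n_taxa = len(table[0])
    grand = [sum(row[j] for row in table) / len(table) for j in range(n_taxa)]

    # Collect the samples belonging to each group label.
    members = {}
    for row, label in zip(table, groups):
        members.setdefault(label, []).append(row)

    total = sum((row[j] - grand[j]) ** 2 for row in table for j in range(n_taxa))
    between = 0.0
    for rows in members.values():
        for j in range(n_taxa):
            group_mean = sum(r[j] for r in rows) / len(rows)
            between += len(rows) * (group_mean - grand[j]) ** 2
    return between / total


def permutation_p_value(table, groups, n_perm=999, seed=0):
    """Permutation test: shuffle the labels and compare explained shares."""
    rng = random.Random(seed)
    observed = constrained_fraction(table, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if constrained_fraction(table, shuffled) >= observed:
            hits += 1
    # Count the observed labelling as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy table: six samples, three taxa, two groups.
toy_table = [[10, 0, 1], [9, 1, 0], [11, 0, 2], [1, 8, 5], [0, 9, 6], [2, 10, 4]]
toy_groups = ["A", "A", "A", "B", "B", "B"]
share, p_value = permutation_p_value(toy_table, toy_groups)
print(f"constrained share = {share:.3f}, permutation p = {p_value:.3f}")
```

An unconstrained ordination of the same table would only display the samples; the sketch shows the extra step that a constraint makes possible, namely asking whether the labelling explains more variance than shuffled labellings do.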

Figures

Figure 1. Unconstrained ordination of lung explant OTUs at phylotype resolution by subject.
The bacterial communities of human lung explants were binned at phylotype resolution and were ordinated using PCoA based on (A) Bray-Curtis distance, (B) Jaccard distance, (C) PCA and (D) CA. Individual samples were colored based on the explant from which they were obtained, regardless of anatomic location.
Figure 2. Unconstrained ordination of lung explant OTUs at phylotype resolution by anatomic location.
The bacterial communities of human lung explants were binned at phylotype resolution and ordinated using PCoA based on (A) Bray-Curtis distance, (B) Jaccard distance, (C) PCA and (D) CA. Individual samples were colored based on the location within the lung from which the sample was obtained, regardless of subject.
Figure 3. Unconstrained ordination of lung explant OTUs at a 3% OTU cutoff by subject.
The bacterial communities of human lung explants were binned at a 3% OTU cutoff to produce high-resolution data. The data were ordinated using PCoA based on (A) Bray-Curtis distance, (B) Jaccard distance, (C) PCA and (D) CA. Individual samples were colored based on the explant from which they were obtained, regardless of anatomic location.
Figure 4. Unconstrained ordination of lung explant OTUs at 3% OTU cutoff by anatomic location.
The bacterial communities of human lung explants were binned at a 3% OTU cutoff to produce high-resolution data. The data were ordinated using PCoA based on (A) Bray-Curtis distance, (B) Jaccard distance, (C) PCA and (D) CA. Individual samples were colored based on the anatomic location within the lung from which the sample was obtained, regardless of subject.
Figure 5. Direct gradient analysis of lung explant OTUs at phylotype resolution constrained by subject.
The bacterial communities of human lung explants were binned at phylotype resolution and constrained ordinations were constructed using the explant from which the sample originated as the constraint. Ordinations were CAP (constrained analysis of principal coordinates) based on (A) Bray-Curtis distance or (B) Jaccard distance; (C) RDA (the constrained form of PCA); or (D) CCA (the constrained form of CA). Individual samples were colored based on the explant from which they were obtained.
Figure 6. Direct gradient analysis of lung explant OTUs at 3% OTU cutoff constrained by subject.
The bacterial communities of COPD lung explants were binned at high resolution, and constrained ordinations were constructed using the explant from which the sample originated as the constraint. Ordinations were CAP ordinations based on (A) Bray-Curtis distance or (B) Jaccard distance; (C) RDA; or (D) CCA. Individual samples were colored based on the explant from which they were obtained.
Figure 7. Direct gradient analysis of lung explant OTUs at phylotype resolution constrained by anatomic location.
The bacterial communities of COPD lung explants were binned at phylotype resolution and constrained ordinations were constructed based on the anatomic lung location from which the sample originated. Ordinations were CAP ordinations based on (A) Bray-Curtis distance or (B) Jaccard distance; (C) RDA; or (D) CCA. Individual samples were colored based on the location in the lung from which they were obtained.
Figure 8. Direct gradient analysis of human lung explant OTUs at high-resolution constrained by anatomic location.
The bacterial communities of COPD lung explants were binned at high resolution, and constrained ordinations were constructed using the location in the lung from which the sample originated as the constraint. Ordinations were CAP ordinations based on (A) Bray-Curtis distance or (B) Jaccard distance; (C) RDA; or (D) CCA. Individual samples were colored based on the anatomic location in the lung from which they were obtained.
Figure 9. Indirect gradient analysis of cecal bacterial communities of mice undergoing gut microbiome modification.
Bacterial OTUs were binned at a 3% OTU cutoff and ordinations were plotted based on (A, B) PCoA using (A) Bray-Curtis distance or (B) Jaccard distance; (C) PCA; and (D) CA. Individual samples were colored based on the treatment regime the mice received.
Figure 10. Direct gradient analysis of the cecal bacterial communities of mice constrained by treatment.
Bacterial OTUs were binned at a 3% OTU cutoff and the data were constrained by treatment regime. Constrained ordinations were plotted based on (A, B) PCoA using (A) Bray-Curtis distance or (B) Jaccard distance; (C) PCA; and (D) CA. Individual samples were colored based on the treatment regime the mice received.
Figure 11. Direct gradient analysis of CD45+ interstitial epithelial leukocytes during C. difficile infection.
A. CD45+ IEL from untreated mice and infected mice were constrained by infection status using CCA and plotted using rgl. Blue, cells from untreated mice; red, cells from infected mice. B. Boxplots characterizing relative fluorescent intensities of the cells in the region indicated in panel A by the white ellipse. The vertical axis depicts cell frequency, relative to all CD45+ events; FSC, forward scatter; SSC, side scatter. C. Frequency of cells within the selected region in both the uninfected and infected state (as a % of CD45+ cells).
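The two dissimilarity measures named throughout the figure legends can be written out in a few lines. The sketch below assumes the standard Bray-Curtis formula on raw counts and the presence/absence form of the Jaccard distance; whether the study used the binary or the quantitative Jaccard variant is not stated here, so that choice is an assumption, as are the function names and the example counts.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return numerator / denominator


def jaccard(a, b):
    """Jaccard distance on presence/absence (assumed binary variant)."""
    present_a = {i for i, x in enumerate(a) if x > 0}
    present_b = {i for i, y in enumerate(b) if y > 0}
    union = present_a | present_b
    shared = present_a & present_b
    return 1 - len(shared) / len(union)


# Hypothetical OTU counts for two samples.
sample_1 = [12, 0, 3, 5]
sample_2 = [10, 4, 0, 5]
print(bray_curtis(sample_1, sample_2), jaccard(sample_1, sample_2))
```

Bray-Curtis is sensitive to differences in abundance, whereas the binary Jaccard distance only registers which OTUs are shared, which is one reason the same samples can be arranged differently in panels A and B of an ordination figure.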

Source: PubMed
