The convergence of carbohydrate active gene repertoires in human gut microbes

Catherine A Lozupone, Micah Hamady, Brandi L Cantarel, Pedro M Coutinho, Bernard Henrissat, Jeffrey I Gordon, Rob Knight

Abstract

The extreme variation in gene content among phylogenetically related microorganisms suggests that gene acquisition, expansion, and loss are important evolutionary forces for adaptation to new environments. Accordingly, phylogenetically disparate organisms that share a habitat may converge in gene content as they adapt to confront shared challenges. This response should be especially pronounced for functional genes that are important for survival in a particular habitat. We illustrate this principle by showing that the repertoires of two different types of carbohydrate-active enzymes, glycoside hydrolases and glycosyltransferases, have converged in bacteria and archaea that live in the human gut and that this convergence is largely due to horizontal gene transfer rather than gene family expansion. We also identify gut microbes that may have more similar dietary niches in the human gut than would be expected based on phylogeny. The techniques used to obtain these results should be broadly applicable to understanding the functional genes and evolutionary processes important for adaptation in many environments and useful for interpreting the large number of reference microbial genome sequences being generated for the International Human Microbiome Project.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Methodological approaches. Schematic of genomic UniFrac, which can be applied to a “forest” of trees. 1) Neighbor joining (NJ) is used to generate a phylogenetic tree for each gene family. In this example, three genomes (colored red, blue, and yellow) are compared using two gene families. 2) Each tree is first normalized by dividing its branch lengths by the maximum root-to-tip distance, which corrects for differing rates of evolution among gene families. The trees are then joined by attaching them to a common root with branches of length zero. 3) Pairwise UniFrac distances are calculated between all possible pairs of genomes using both unweighted and weighted UniFrac. For each pair, all sequences not from either genome are first removed. The unweighted UniFrac distance is the fraction of branch length that leads to one genome or the other (yellow and blue branches) but not both (gray branches). Paralogous genes (circled in red) do not heavily affect the results because they introduce little unique branch length. Weighted UniFrac weights each branch by the differential representation of its descendants in the two genomes (represented by line thickness; gray branches carry no weight), so the blue genome looks more different from the yellow genome because of the paralogs. 4) The final genome cluster is made by applying the NJ algorithm to the UniFrac distance matrix. In this example, only one switch between non-gut (N) and gut (G) is needed to describe the distribution of states on this tree. 5) Genome clusters are tested for whether they group gut genomes together better than phylogeny does. If the ancestral state was N, more changes are required to explain the distribution of states in the 16S rRNA phylogenetic tree than in the genome cluster, suggesting convergence.
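Steps 2 and 3 of the caption can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' implementation: the nested-tuple tree representation, the function names (max_root_to_tip, scale, join_forest, unweighted_unifrac), and the example branch lengths are all hypothetical. Each input tree is assumed to have a root branch of length zero, and only the unweighted measure is shown.

```python
# Toy sketch of unweighted UniFrac on a joined "forest" of gene-family trees.
# A node is (branch_length, payload): payload is a genome label (str) for a
# tip, or a list of child nodes for an internal node. All names and numbers
# here are hypothetical and chosen for illustration only.

def max_root_to_tip(node):
    """Longest path length from this node down to any tip."""
    length, payload = node
    if isinstance(payload, str):
        return length
    return length + max(max_root_to_tip(child) for child in payload)

def scale(node, factor):
    """Divide every branch length in the subtree by factor."""
    length, payload = node
    if isinstance(payload, str):
        return (length / factor, payload)
    return (length / factor, [scale(child, factor) for child in payload])

def join_forest(trees):
    """Normalize each tree to unit depth, then hang all trees on one root.

    Assumes each input tree has a root branch of length zero, so the trees
    are attached to the common root by zero-length branches.
    """
    normalized = [scale(tree, max_root_to_tip(tree)) for tree in trees]
    return (0.0, normalized)

def unweighted_unifrac(forest, a, b):
    """Fraction of branch length leading to genome a or b but not both."""
    if a == b:
        return 0.0
    unique = 0.0
    total = 0.0

    def walk(node):
        nonlocal unique, total
        length, payload = node
        if isinstance(payload, str):
            genomes = {payload}
        else:
            genomes = set()
            for child in payload:
                genomes |= walk(child)
        # Ignoring every other genome is equivalent to pruning its tips.
        genomes &= {a, b}
        if genomes:
            total += length
            if len(genomes) == 1:
                unique += length
        return genomes

    walk(forest)
    return unique / total if total else 0.0

# Two gene families, three genomes; family_2 lacks the red genome.
family_1 = (0.0, [(1.0, "red"), (0.5, [(0.5, "blue"), (0.5, "yellow")])])
family_2 = (0.0, [(2.0, "blue"), (2.0, "yellow")])
forest = join_forest([family_1, family_2])

d_blue_yellow = unweighted_unifrac(forest, "blue", "yellow")
d_red_blue = unweighted_unifrac(forest, "red", "blue")
print(round(d_blue_yellow, 3), d_red_blue)  # 0.857 1.0
```

In this toy forest, blue and yellow share only the short internal branch in the first family, so most of their branch length is unique. Red and blue share no branch of non-zero length, which gives the maximum distance of 1.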
Fig. 2.
Clustering of the 67 gut and non-gut-associated microbes included in this study. (A) 16S rRNA-based phylogenetic tree that is the majority rule consensus of 1,000 bootstrapped NJ trees (see SI Methods). Gut microorganisms are highlighted in red. Sequenced genomes from the HGMI are marked with an asterisk. Red dots denote the 13 internal nodes where Fitch parsimony counted a gut/non-gut switch (see Fig. 1). Higher-level taxonomic categories are noted with both text and shading. (B) The GT weighted UniFrac cluster. Higher-level taxonomic categories are shaded as in A and gut organisms are colored red. The red box highlights interdivision clustering between gut Actinobacteria and Firmicutes. Red arrows show where related, non-gut organisms (non-gut Actinobacteria and C. thermocellum) cluster instead. (C) The GH unweighted UniFrac cluster. Shading and text colors are as described for B. The red boxes and arrows highlight habitat-related clustering: the gut Mollicute E. dolichum clusters with a gut actinobacterium instead of with its relative Acholeplasma laidlawii, and the non-gut Lactobacillus brevis clusters with O. oeni instead of with its gut relative L. salivarius. (D) The GH weighted count cluster. Gut organisms are in red text and members of the Bacteroidetes are highlighted in blue.
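The gut/non-gut switch counts mentioned in the captions can be illustrated with a small Fitch parsimony sketch. It rests on simplifying assumptions: the tree is strictly binary (a majority rule consensus tree may not be), the root state is left unconstrained rather than fixed to N, and the function name fitch_switches and both example trees are hypothetical.

```python
def fitch_switches(tree):
    """Count the minimum number of state changes on a rooted binary tree.

    A tip is a state label such as "G" (gut) or "N" (non-gut); an internal
    node is a (left, right) tuple. Illustrative sketch only.
    """
    switches = 0

    def possible_states(node):
        nonlocal switches
        if isinstance(node, str):
            return {node}
        left, right = (possible_states(child) for child in node)
        shared = left & right
        if shared:
            return shared
        # The children disagree, so one change is needed below this node.
        switches += 1
        return left | right

    possible_states(tree)
    return switches

# Hypothetical trees: gut genomes group together in the genome cluster
# but are interleaved with non-gut relatives in the phylogeny.
genome_cluster = ((("G", "G"), "G"), (("N", "N"), "N"))
phylogeny = ((("G", "N"), ("G", "N")), ("G", "N"))

print(fitch_switches(genome_cluster), fitch_switches(phylogeny))  # 1 3
```

A genome cluster that needs fewer switches than the phylogeny for the same set of organisms is the pattern the captions describe as suggesting convergence.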

Source: PubMed
