A human gut microbial gene catalogue established by metagenomic sequencing

Junjie Qin, Ruiqiang Li, Jeroen Raes, Manimozhiyan Arumugam, Kristoffer Solvsten Burgdorf, Chaysavanh Manichanh, Trine Nielsen, Nicolas Pons, Florence Levenez, Takuji Yamada, Daniel R Mende, Junhua Li, Junming Xu, Shaochuan Li, Dongfang Li, Jianjun Cao, Bo Wang, Huiqing Liang, Huisong Zheng, Yinlong Xie, Julien Tap, Patricia Lepage, Marcelo Bertalan, Jean-Michel Batto, Torben Hansen, Denis Le Paslier, Allan Linneberg, H Bjørn Nielsen, Eric Pelletier, Pierre Renault, Thomas Sicheritz-Ponten, Keith Turner, Hongmei Zhu, Chang Yu, Shengting Li, Min Jian, Yan Zhou, Yingrui Li, Xiuqing Zhang, Songgang Li, Nan Qin, Huanming Yang, Jian Wang, Søren Brunak, Joel Doré, Francisco Guarner, Karsten Kristiansen, Oluf Pedersen, Julian Parkhill, Jean Weissenbach, MetaHIT Consortium, Peer Bork, S Dusko Ehrlich, Jun Wang, Maria Antolin, François Artiguenave, Hervé Blottiere, Natalia Borruel, Thomas Bruls, Francesc Casellas, Christian Chervaux, Antonella Cultrone, Christine Delorme, Gérard Denariaz, Rozenn Dervyn, Miguel Forte, Carsten Friss, Maarten van de Guchte, Eric Guedon, Florence Haimet, Alexandre Jamet, Catherine Juste, Ghalia Kaci, Michiel Kleerebezem, Jan Knol, Michel Kristensen, Severine Layec, Karine Le Roux, Marion Leclerc, Emmanuelle Maguin, Raquel Melo Minardi, Raish Oozeer, Maria Rescigno, Nicolas Sanchez, Sebastian Tims, Toni Torrejon, Encarna Varela, Willem de Vos, Yohanan Winogradsky, Erwin Zoetendal

Abstract

To understand the impact of gut microbes on human health and well-being, it is crucial to assess their genetic potential. Here we describe the Illumina-based metagenomic sequencing, assembly and characterization of 3.3 million non-redundant microbial genes, derived from 576.7 gigabases of sequence, from faecal samples of 124 European individuals. The gene set, approximately 150 times larger than the human gene complement, contains an overwhelming majority of the prevalent (more frequent) microbial genes of the cohort and probably includes a large proportion of the prevalent human intestinal microbial genes. The genes are largely shared among individuals of the cohort. Over 99% of the genes are bacterial, indicating that the entire cohort harbours between 1,000 and 1,150 prevalent bacterial species and each individual at least 160 such species, which are also largely shared. We define and describe the minimal gut metagenome and the minimal gut bacterial genome in terms of functions present in all individuals and most bacteria, respectively.
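The headline figures in the abstract can be cross-checked with simple division. The sketch below uses only the numbers stated above; the variable names are illustrative, and the per-individual value is a naive mean that the abstract itself does not report.

```python
# Figures quoted in the abstract.
catalogue_genes = 3_300_000      # non-redundant microbial genes
fold_over_human = 150            # "approximately 150 times larger"
total_gigabases = 576.7          # total sequence generated
individuals = 124                # European individuals sampled

# Human gene complement implied by the stated ratio (approximate).
implied_human_genes = catalogue_genes / fold_over_human

# Naive mean sequencing depth, assuming nothing about how depth
# was actually distributed across samples.
mean_gb_per_individual = total_gigabases / individuals

print(f"implied human gene complement: ~{implied_human_genes:,.0f}")
print(f"naive mean depth: ~{mean_gb_per_individual:.2f} Gb per individual")
```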

Figures

Figure 1. Coverage of human gut microbiome
The three human microbial sequencing read sets, Illumina GA reads generated from 124 individuals in this study (black; n=124), Roche/454 reads from 18 human twins and their mothers (grey; n=18), and Sanger reads from 13 Japanese individuals (white; n=13), were aligned to each of the reference sequence sets. Mean values ± s.e.m. are plotted.
Figure 2. Predicted ORFs in the human gut microbiomes
a, Number of unique genes as a function of the extent of sequencing. The gene accumulation curve corresponds to the Sobs (Mao Tau) values, calculated using EstimateS (version 8.2.0) on 100 randomly chosen samples (owing to memory limitations). b, Coverage of genes from 89 frequent gut microbial species (Supplementary Table 12). c, Number of functions captured as a function of the number of samples investigated, based upon known (well characterized) orthologous groups (OGs; bottom), known+unknown orthologous groups (including, for example, putative, predicted and conserved hypothetical functions; centre) and OGs+novel gene families (>20 proteins) recovered from the metagenome (top).
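To illustrate what a gene accumulation curve of this kind computes, here is a minimal sketch of sample-based rarefaction, assuming the usual analytical form of the Mao Tau estimator: the expected number of distinct genes seen in h samples drawn without replacement from H. It is not the EstimateS implementation, and the function name `mao_tau` and its input layout are hypothetical.

```python
from math import comb


def mao_tau(occurrences, n_samples):
    """Expected number of distinct genes observed in h samples, h = 1..n_samples.

    occurrences: for each gene, the number of samples in which it was detected.
    n_samples:   total number of samples (H).
    """
    curve = []
    for h in range(1, n_samples + 1):
        n_subsets = comb(n_samples, h)
        # A gene found in k samples is missed by a random subset of h samples
        # with probability C(H - k, h) / C(H, h); comb() returns 0 when h > H - k.
        expected = sum(1 - comb(n_samples - k, h) / n_subsets for k in occurrences)
        curve.append(expected)
    return curve


# Toy data: three genes detected in 3, 1 and 2 of three samples.
print(mao_tau([3, 1, 2], 3))
```

The last point of the curve always equals the observed gene count, because every detected gene occurs in at least one of the H samples.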
Figure 3. Relative abundance of frequent microbial genomes among individuals of the cohort
Boxes denote the 25th and 75th percentiles; the black line in the box corresponds to the median; the “whiskers” indicate the interquartile range from either or both ends of the box; and the dots show the outliers beyond the ends of the whiskers (see Supplementary Methods for computation).
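The box-plot elements described in this caption can be sketched as follows. This is an illustration under assumptions, not the computation from the Supplementary Methods: the function name `box_stats` is hypothetical, the quartile interpolation is the one provided by Python's `statistics.quantiles`, and the whisker reach is a parameter that defaults to one interquartile range, following a literal reading of the caption.

```python
import statistics


def box_stats(values, whisker_iqr=1.0):
    """Summarize values the way a box plot draws them.

    whisker_iqr: how many interquartile ranges the whiskers may extend
                 beyond the box (assumed; many tools use 1.5 instead).
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker_iqr * iqr
    high_fence = q3 + whisker_iqr * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whiskers end at the most extreme data points inside the fences.
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        # Points beyond the fences are drawn as individual dots.
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


print(box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))
```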
Figure 4. Bacterial species abundance differentiates IBD patients and healthy individuals
Principal component analysis based on the abundance of 155 species with ≥1% genome coverage by the Illumina reads in at least 1 individual of the cohort was carried out with 14 healthy individuals and 25 IBD patients from Spain.
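The two steps named in this caption, selecting species by genome coverage and projecting individuals onto principal components, can be sketched in plain Python. This is a toy illustration only: the function names and data layout are hypothetical, the first component is found by power iteration rather than by whatever software the authors used, and a real analysis would rely on a numerical library.

```python
import math


def select_species(coverage, min_coverage=0.01):
    """Keep species reaching min_coverage (1%) in at least one individual.

    coverage: dict mapping species name -> list of genome coverage
              fractions, one value per individual.
    """
    return [name for name, vals in coverage.items() if max(vals) >= min_coverage]


def first_principal_component(rows, iterations=200):
    """Return (axis, scores) for the first principal component.

    rows: one list of species abundances per individual.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Non-uniform start vector, so it is unlikely to be orthogonal to the axis.
    v = [float(j + 1) for j in range(p)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        # Multiply by X^T X, which is proportional to the covariance matrix.
        proj = [sum(c[j] * v[j] for j in range(p)) for c in centered]
        w = [sum(centered[i][j] * proj[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # no variance along the current direction
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores


# Toy abundances for two species in four individuals (two per group).
axis, scores = first_principal_component(
    [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
)
print(axis, scores)
```

The sign of a principal axis is arbitrary, so only the separation of the scores between groups is meaningful, not which group falls on the positive side.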
Figure 5. Clusters that contain the B. subtilis essential genes
The clusters were ranked by the number of genes they contain, normalized by average length and copy number (see Supplementary Fig. 10) and the proportion of clusters with the essential B. subtilis genes was determined for successive groups of 100 clusters. Range indicates the part of the cluster distribution that contains 86% of the B. subtilis essential genes.
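The binning described in this caption can be sketched as below, assuming the clusters have already been ranked and each is flagged for whether it contains an essential B. subtilis gene. The normalization by length and copy number is not reproduced, the function names are hypothetical, and reading the "range" as the shortest top-ranked prefix reaching the target share is an assumption.

```python
def essential_fraction_by_group(ranked_flags, group_size=100):
    """Proportion of flagged clusters in each successive group of ranked clusters.

    ranked_flags: booleans in rank order; True if the cluster contains
                  an essential gene.
    """
    fractions = []
    for start in range(0, len(ranked_flags), group_size):
        group = ranked_flags[start:start + group_size]
        fractions.append(sum(group) / len(group))
    return fractions


def prefix_covering(ranked_flags, target=0.86):
    """Number of top-ranked clusters needed to reach the target share of flagged ones."""
    total = sum(ranked_flags)
    if total == 0:
        return 0
    seen = 0
    for rank, flag in enumerate(ranked_flags, start=1):
        seen += flag
        if seen / total >= target:
            return rank
    return len(ranked_flags)


# Toy ranking of ten clusters, binned in groups of five.
flags = [True, True, True, False, False, True, False, False, False, False]
print(essential_fraction_by_group(flags, group_size=5))
print(prefix_covering(flags))
```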
Figure 6. Characterization of the minimal gut genome and metagenome
a, Projection of the minimal gut genome on the KEGG pathways using the iPath tool. b, Functional composition of the minimal gut genome and metagenome. c, Estimation of the minimal gut metagenome size. Known orthologous groups (OGs; red), known+unknown OGs (blue) and OGs+novel gene families (>20 proteins; grey). Inset: Composition of the gut minimal microbiome. Large circle: Classification in the minimal metagenome according to OG occurrence in STRING7 bacterial genomes. Common (25%), uncommon (35%) and rare (45%) OGs are present in >50%, <50% but >10%, and <10% of genomes, respectively. Small circle: composition of the rare OGs. Unknown (80%) have no annotation or are poorly characterized, while known bacterial (19%) and phage-related (1%) OGs have a functional description.
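The occurrence thresholds in the inset translate directly into a small classifier. The sketch below is illustrative: the function name `classify_og` is hypothetical, and because the caption does not say how values of exactly 50% or 10% are handled, assigning them to the lower class is an assumption.

```python
def classify_og(genomes_with_og, genomes_total):
    """Classify an orthologous group by the fraction of reference genomes containing it."""
    fraction = genomes_with_og / genomes_total
    if fraction > 0.5:
        return "common"
    if fraction > 0.1:
        return "uncommon"
    # Includes fractions of exactly 10% (assumed boundary handling).
    return "rare"


for count in (60, 30, 5):
    print(count, classify_og(count, 100))
```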

Source: PubMed
