Whole genome metagenomic analysis of the gut microbiome of differently fed infants identifies differences in microbial composition and functional genes, including an absent CRISPR/Cas9 gene in the formula-fed cohort

Matthew D Di Guglielmo, Karl Franke, Courtney Cox, Erin L Crowgey

Abstract

Background: Advances in sequencing capabilities have enhanced the study of the human microbiome. Few studies have focused on the gastrointestinal (gut) microbiome of infants, particularly on the impact of diet in breast-fed (BF) versus formula-fed (FF) infants. It is unclear what effect, if any, early feeding has on the short-term or long-term composition and function of the gut microbiome.

Results: Using a shotgun metagenomics approach, differences in the gut microbiome between BF (n = 10) and FF (n = 5) infants were detected. A Jaccard distance principal coordinate analysis clustered BF versus FF infants based on the presence or absence of species identified in their gut microbiome. Thirty-two genera were identified as statistically different between the sequenced gut microbiomes of BF and FF infants. Furthermore, the computational workflow identified 371 bacterial genes whose abundance was statistically different between the BF and FF cohorts. Only seven genes, including CRISPR/Cas9, were lower in abundance (or absent) in the FF cohort compared to the BF cohort, whereas the remaining candidates, including autotransporter adhesins, were higher in abundance in the FF cohort compared to the BF cohort.

Conclusions: These studies demonstrated that FF infants have, at an early age, a significantly different gut microbiome, with potential implications for the function of the fecal microbiota. Interactions between the fecal microbiota and host, hinted at here, have been linked to numerous diseases. Determining whether these less abundant or more abundant genes have biological consequences related to infant feeding may aid in understanding the adult gut microbiome and the pathogenesis of obesity.

Keywords: Breast-feeding; Gut microbiome; Infants; Metagenomics; Next generation sequencing; Whole genome.

Conflict of interest statement

Declaration of Competing Interest: None of the authors has a conflict of interest to report.

Figures

Fig. 1.
Metagenomic workflow for comparing breast-fed versus formula-fed infants. Biospecimen processing (top box): Subjects were enrolled into two groups, breast-fed (BF) and formula-fed (FF); demographics are summarized in Table 1. (1) A fresh fecal sample was collected from each subject and flash frozen. (2) DNA was extracted from the fecal sample and (3) used to prepare shotgun metagenomic libraries for next generation sequencing (Illumina platform). Bioinformatics: Sunbeam (middle box): Raw reads (FASTQ) were (4) quality trimmed to remove adapter sequences and low-quality bases. The cleaned FASTQ files (5) were mapped to the human genome and PhiX to remove contaminating (nonspecific) or control reads. The reads from the decontaminated FASTQ files (6) were classified using the Kraken database. Bioinformatics: Publicly available algorithms, custom pipeline (bottom box): The Kraken-classified reads were (7) analyzed via edgeR to determine differentially represented genera, summarized in Table 2 and Figs. 2–4. The decontaminated FASTQ files were pooled (8) to create one large library for de novo assembly (MEGAHIT) of a metagenome (9), which was annotated with Prodigal and NCBI COGs. Reads from the individual FASTQ files were aligned to the metagenome using STAR and RSEM (10). Normalized gene counts were calculated via edgeR, and results are displayed in Table 2.
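The gene count normalization in the final workflow step can be pictured with a simplified sketch. The Python below computes plain counts per million (CPM) from a genes-by-samples count matrix. It is an assumption-laden illustration only: the study used edgeR (an R package), which additionally applies its own normalization factors, and the function name `counts_per_million` and the toy counts are hypothetical, not taken from the paper.

```python
import numpy as np

def counts_per_million(counts):
    """Scale raw gene counts so that each sample (column) sums to one million.

    Simplified library-size scaling only; this is NOT edgeR's full
    normalization, which also estimates per-sample scaling factors.
    """
    counts = np.asarray(counts, dtype=float)
    library_sizes = counts.sum(axis=0)      # total reads per sample
    return counts / library_sizes * 1e6     # broadcast across gene rows

# Hypothetical toy data: 2 genes (rows) x 2 samples (columns).
toy_counts = [[10, 20],
              [90, 180]]
cpm = counts_per_million(toy_counts)
```

Scaling by library size makes samples sequenced to different depths comparable before any between-cohort test is applied.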
Fig. 2.
Principal coordinates analysis based on species-level data. (A) Bray-Curtis distance plot based on species abundance per subject. Each data point represents either a breast-fed (red) or formula-fed (blue) subject. The shape of the data points represents either technical replicate 1 (circle) or technical replicate 2 (triangle). Axis 1 explains 17.25% of the variance and axis 2 explains 12.04%. (B) Jaccard distance plot based on presence or absence of species per subject. Each data point represents either a breast-fed (red) or formula-fed (blue) subject. The shape of the data points represents either technical replicate 1 (circle) or technical replicate 2 (triangle). Axis 1 explains 7.62% of the variance and axis 2 explains 6.46%.
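To make the two distance measures concrete, the following Python is a minimal sketch of Bray-Curtis dissimilarity (abundance-based), Jaccard distance (presence/absence-based), and classical principal coordinates analysis. It is not the authors' pipeline: the function names and the toy abundance table are hypothetical, and clipping negative eigenvalues to zero is one of several conventions, so the explained-variance fractions from a given package may differ.

```python
import numpy as np

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    return np.abs(u - v).sum() / (u + v).sum()

def jaccard(u, v):
    """Jaccard distance computed on presence/absence of each species."""
    a, b = u > 0, v > 0
    union = np.logical_or(a, b).sum()
    return (1.0 - np.logical_and(a, b).sum() / union) if union else 0.0

def distance_matrix(X, metric):
    """Symmetric pairwise distance matrix over the rows (samples) of X."""
    n = X.shape[0]
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = metric(X[i], X[j])
    return D

def pcoa(D, k=2):
    """Classical PCoA: double-center squared distances, then eigendecompose."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # Gower-centered matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1]               # largest eigenvalue first
    vals, vecs = vals[order], vecs[:, order]
    pos = np.clip(vals, 0, None)                 # assumption: drop negative eigenvalues
    coords = vecs[:, :k] * np.sqrt(pos[:k])
    explained = pos[:k] / pos.sum()              # fraction of variance per axis
    return coords, explained

# Hypothetical toy table: 4 samples (rows) x 4 species (columns).
abundance = np.array([
    [30, 10,  0,  0],
    [25, 15,  0,  1],
    [ 0,  5, 40, 20],
    [ 0,  0, 35, 25],
], dtype=float)

D_bc = distance_matrix(abundance, bray_curtis)
D_jac = distance_matrix(abundance, jaccard)
coords_bc, explained_bc = pcoa(D_bc)
coords_jac, explained_jac = pcoa(D_jac)
```

Because Jaccard ignores abundance, two samples sharing the same species in very different proportions are identical under Jaccard but can be far apart under Bray-Curtis, which is why the two panels can show different clustering.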
Fig. 3.
Distribution of genera identified in the gut microbiome of breast-fed and formula-fed infants. Left panel: Box plot of the most abundant genera in breast-fed infants (red boxes). Right panel: Box plot of the most abundant genera in formula-fed infants (blue boxes). The red asterisks mark the genera that were statistically different between the breast-fed and formula-fed cohorts. The y-axis represents phylogenetic abundance (percentage), and each genus is represented on the x-axis.
Fig. 4.
Cas9 validation. (A) InterPro was used to analyze the amino acid sequence encoded by the Cas9 gene to validate its identity. Non-quantitative PCR was used to validate the results of the bioinformatic analysis for (B) Cas9 and (C) a carboxypeptidase.

Source: PubMed
