Replication and refinement of a vaginal microbial signature of preterm birth in two racially distinct cohorts of US women

Benjamin J Callahan, Daniel B DiGiulio, Daniela S Aliaga Goltsman, Christine L Sun, Elizabeth K Costello, Pratheepa Jeganathan, Joseph R Biggio, Ronald J Wong, Maurice L Druzin, Gary M Shaw, David K Stevenson, Susan P Holmes, David A Relman

Abstract

Preterm birth (PTB) is the leading cause of neonatal morbidity and mortality. Previous studies have suggested that the maternal vaginal microbiota contributes to the pathophysiology of PTB, but conflicting results in recent years have raised doubts. We conducted a study of PTB compared with term birth in two cohorts of pregnant women: one predominantly Caucasian (n = 39) at low risk for PTB, the second predominantly African American and at high risk (n = 96). We profiled the taxonomic composition of 2,179 vaginal swabs collected prospectively and weekly during gestation using 16S rRNA gene sequencing. Previously proposed associations between PTB and lower Lactobacillus and higher Gardnerella abundances replicated in the low-risk cohort, but not in the high-risk cohort. High-resolution bioinformatics enabled taxonomic assignment to the species and subspecies levels, revealing that Lactobacillus crispatus was associated with low risk of PTB in both cohorts, while Lactobacillus iners was not, and that a subspecies clade of Gardnerella vaginalis explained the genus association with PTB. Patterns of co-occurrence between L. crispatus and Gardnerella were highly exclusive, while Gardnerella and L. iners often coexisted at high frequencies. We argue that the vaginal microbiota is better represented by the quantitative frequencies of these key taxa than by classifying communities into five community state types. Our findings extend and corroborate the association between the vaginal microbiota and PTB, demonstrate the benefits of high-resolution statistical bioinformatics in clinical microbiome studies, and suggest that previous conflicting results may reflect the different risk profile of women of black race.

Keywords: Gardnerella; Lactobacillus; pregnancy; prematurity; vaginal microbiota.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Average gestational frequencies of Lactobacillus, Gardnerella, and Ureaplasma for women who delivered preterm and at term. Each point shows the average frequency of the genus-of-interest across gestational samples from one woman. Upper shows the Stanford cohort (n = 39), and Lower shows the UAB cohort (n = 96). The significance of the differences between average gestational frequencies in term and preterm births, in the previously reported directions, was assessed by one-sided Wilcoxon rank-sum test; **P < 0.01.
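The per-woman averaging and one-sided rank-sum comparison described in this caption can be sketched roughly as follows. This is an illustration only, not the authors' code: the function names, the input layout (woman ID, genus read count, total read count per sample), and the use of a normal approximation with continuity correction and no tie correction are all assumptions made here.

```python
from collections import defaultdict
from math import erfc, sqrt


def average_frequency_per_woman(samples):
    """samples: iterable of (woman_id, genus_reads, total_reads), one per swab.
    Returns the mean relative frequency of the genus for each woman."""
    freqs = defaultdict(list)
    for woman, genus_reads, total_reads in samples:
        freqs[woman].append(genus_reads / total_reads)
    return {w: sum(f) / len(f) for w, f in freqs.items()}


def midranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_sum_one_sided(x, y):
    """One-sided Wilcoxon rank-sum (Mann-Whitney) test.
    Alternative hypothesis: values in x tend to be smaller than values in y.
    Assumption: normal approximation with continuity correction; the
    variance is not corrected for ties."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2  # U statistic for x
    mean = n1 * n2 / 2
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u + 0.5 - mean) / sd
    p = 0.5 * erfc(-z / sqrt(2))  # lower-tail normal probability
    return u, p
```

For a genus expected to be depleted in preterm births, x would hold the per-woman averages of the preterm group and y those of the term group; for a genus expected to be enriched, the arguments are swapped. With groups this small an exact test is usually preferable to the approximation used in this sketch.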
Fig. 2.
Associations between Gardnerella/Lactobacillus variants and PTB in two cohorts of women. (A) Associations between average gestational frequency and PTB were tested for the nine detected Gardnerella 16S rRNA sequence variants, and for the four major Lactobacillus species in the vaginal microbiota. Testing was performed separately for the Stanford and UAB cohorts. Dashed lines indicate P = 0.05 (one-sided Wilcoxon rank-sum test). (B) The three most abundant Gardnerella sequence variants were mapped onto a phylogenetic tree constructed from sequenced G. vaginalis genomes (SI Materials and Methods). (C) MDS was performed on the Bray–Curtis distances between all samples. The two MDS dimensions that explain the greatest amounts of variation in the data are displayed. Stanford (n = 897) and UAB (n = 1,282) samples are plotted separately. Landmark samples (magenta) contained the highest proportion of the labeled taxa: G1 (66% frequency in landmark sample), G2 (96%), L. crispatus (>99%), and L. iners (>99%).
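The Bray–Curtis distances underlying panel C can be illustrated with a short sketch. The function names and the input layout (one list of taxon abundances per sample, all in the same taxon order) are assumptions made for this example; the MDS step itself is not shown and would take the resulting matrix as input.

```python
def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two abundance profiles that list
    the same taxa in the same order. Returns 0 for identical profiles and
    1 for profiles that share no taxa."""
    numerator = sum(abs(a - b) for a, b in zip(p, q))
    denominator = sum(p) + sum(q)
    return numerator / denominator if denominator else 0.0


def distance_matrix(profiles):
    """Symmetric matrix of pairwise Bray-Curtis dissimilarities."""
    n = len(profiles)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = bray_curtis(profiles[i], profiles[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Made-up two-taxon profiles, for illustration only.
example = [[1.0, 0.0], [0.0, 1.0], [0.75, 0.25]]
distances = distance_matrix(example)
```

When every profile is a relative frequency vector summing to 1, the denominator is always 2, so the dissimilarity reduces to half the sum of absolute differences.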
Fig. S1.
Hierarchical clustering on the Euclidean distances of the presence/absence of 1,553 orthologous genes in 17 G. vaginalis genomes. Green: genes present; black: genes absent. Cluster 1, orthologous genes present in nearly all variants; cluster 2, orthologs in G2 variants; cluster 3, orthologous genes in G2 and G3 variants; cluster 4, orthologous genes in G1 variants.
Fig. S2.
Relative contribution of functions in clade-specific G. vaginalis genes. The relative number of genes in each functional category was determined for clusters 2–4 from Fig. S1.
Fig. S3.
Statistical association between PTB and the average frequencies of Lactobacillus and Gardnerella during particular periods of gestation. The association between PTB and Lactobacillus/Gardnerella, as well as the two most abundant subgenus variants of each, was tested (one-sided Wilcoxon rank-sum test) in subsets of samples from specific time periods of gestation.
Fig. 3.
The distribution of Lactobacillus and Gardnerella in the vaginal microbiota of two cohorts of pregnant women. (A) The frequencies of Lactobacillus species and Gardnerella variants in the pooled samples from the Stanford and UAB cohorts. (B) The joint distributions of L. crispatus with Gardnerella, and L. iners with Gardnerella, stratified by cohort. The x axis shows the summed frequency of Gardnerella and the Lactobacillus species; the y axis shows the ratio between the two (log-scale).
Fig. S4.
Average gestational frequencies of STI-associated microbes. None of these taxa were significantly associated with PTB: Chlamydia P = 0.96 (one-sided Wilcoxon rank-sum test), Gonorrhea (N. gonorrhoeae) P = 0.30, Trichomonas P = 0.12, although power was limited by the small number of women in whom they were observed. These STI-associated microbes were only detected in the UAB cohort.
Fig. 4.
Statistical association between the average gestational frequencies of detected genera and preterm birth in two cohorts of women. (A) We tested the association between PTB and increased gestational frequency for all detected genera (for Lactobacillus, decreased frequency) in each cohort by the Wilcoxon rank-sum test. Genera in red have a significant composite P value after controlling FDR < 0.1 (Methods). Text size scales with the square root of study-wide frequency. (B) The fraction of samples in which the four highlighted genera were present among vaginal samples and negative controls, stratified by sequencing run.
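Controlling the FDR across many genera can be sketched as follows. The caption does not say which procedure was used, and the construction of the composite P value is described in the Methods, which are not reproduced here. This example therefore assumes the Benjamini-Hochberg step-up procedure and takes one P value per genus as given; the function name and the input values are hypothetical.

```python
def benjamini_hochberg(pvalues, fdr=0.1):
    """Benjamini-Hochberg step-up procedure (assumed here, not confirmed
    by the source). Returns one boolean per input P value: True where the
    hypothesis is rejected at the requested false discovery rate."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    # Find the largest rank k whose P value satisfies p_(k) <= k * fdr / m.
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * fdr / m:
            cutoff = rank
    rejected = [False] * m
    # Reject every hypothesis ranked at or below that cutoff.
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


# Made-up P values for four hypothetical genera, for illustration only.
significant = benjamini_hochberg([0.001, 0.04, 0.2, 0.9], fdr=0.1)
```

Because the procedure is step-up, a P value that misses its own threshold is still rejected when a larger P value further down the ranking meets its threshold.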
Fig. S5.
Statistical association between the average gestational frequencies of high-frequency sequence variants and preterm birth in two cohorts of women. We tested the association between PTB and increased gestational frequency for all non-Lactobacillus sequence variants with a minimum frequency of 0.001 in either the Stanford or UAB cohort by the Wilcoxon rank-sum test. Sequence variants are labeled by their assigned genus. Sequence variants with a red label have a significant composite P value after controlling FDR < 0.1 (Methods). Text size scales with the square root of study-wide frequency. The significant Gardnerella sequence variant is the G2 variant highlighted in the main text.
Fig. S6.
Statistical association between the average gestational frequencies of high-frequency genera and preterm birth in two cohorts of women. We tested the association between PTB and increased gestational frequency for all non-Lactobacillus genera with a minimum frequency of 0.001 in either the Stanford or UAB cohort by the Wilcoxon rank-sum test. Genera with a red label have a significant composite P value after controlling FDR < 0.1 (Methods). Text size scales with the square root of study-wide frequency. Note that Prevotella has a much stronger association with PTB when the data are aggregated at the level of genus than when each sequence variant is considered separately.
Fig. S7.
Sampling regimen in the Stanford and UAB cohorts. Points show vaginal swab samples. Parentheses indicate the gestational week of delivery. The median week of sampling in the Stanford cohort was 24, and the interquartile range (IQR) was 17–31 wk. The median week of sampling in the UAB cohort was 27, and the IQR was 21–32 wk.

Source: PubMed
