Dynamic molecular changes during the first week of human life follow a robust developmental trajectory

Amy H Lee, Casey P Shannon, Nelly Amenyogbe, Tue B Bennike, Joann Diray-Arce, Olubukola T Idoko, Erin E Gill, Rym Ben-Othman, William S Pomat, Simon D van Haren, Kim-Anh Lê Cao, Momoudou Cox, Alansana Darboe, Reza Falsafi, Davide Ferrari, Daniel J Harbeson, Daniel He, Cai Bing, Samuel J Hinshaw, Jorjoh Ndure, Jainaba Njie-Jobe, Matthew A Pettengill, Peter C Richmond, Rebecca Ford, Gerard Saleu, Geraldine Masiria, John Paul Matlam, Wendy Kirarock, Elishia Roberts, Mehrnoush Malek, Guzmán Sanchez-Schmitz, Amrit Singh, Asimenia Angelidou, Kinga K Smolen, EPIC Consortium, Ryan R Brinkman, Al Ozonoff, Robert E W Hancock, Anita H J van den Biggelaar, Hanno Steen, Scott J Tebbutt, Beate Kampmann, Ofer Levy, Tobias R Kollmann, Diana Vo, Ken Kraft, Kerry McEnaney, Sofia Vignolo, Arnaud Marchant

Abstract

Systems biology can unravel complex biology but has not been extensively applied to human newborns, a group highly vulnerable to a wide range of diseases. We optimized methods to extract transcriptomic, proteomic, metabolomic, cytokine/chemokine, and single cell immune phenotyping data from <1 ml of blood, a volume readily obtained from newborns. Indexing to baseline and applying innovative integrative computational methods reveals dramatic changes along a remarkably stable developmental trajectory over the first week of life. This is most evident in changes of interferon and complement pathways, as well as neutrophil-associated signaling. Validated across two independent cohorts of newborns from West Africa and Australasia, a robust and common trajectory emerges, suggesting a purposeful rather than random developmental path. Systems biology and innovative data integration can provide fresh insights into the molecular ontogeny of the first week of life, a dynamic developmental phase that is key for health and disease.

Conflict of interest statement

O.L. is a named inventor on patents regarding bactericidal/permeability increasing protein (BPI), including “Therapeutic uses of BPI protein products in BPI-deficient humans” (WO2000059531A3) and “BPI and its congeners as radiation mitigators and radiation protectors” (WO2012138839A1). R.R.B. has ownership interest in Cytapex Bioinformatics Inc. The remaining authors declare no competing interests.

Figures

Fig. 1
Sample processing overview. Thirty newborns were recruited in The Gambia, with each newborn providing a peripheral blood sample on DOL0 and subsets of ten newborns each providing a second peripheral blood sample at either DOL1, 3 or 7, resulting in a total of 60 blood samples. Newborn peripheral venous blood was drawn directly into heparinized collection tubes. Aliquots (200 μl) were removed for transcriptomic analysis. Plasma was then harvested from the remaining whole blood after a spin, and cryopreserved for cytokine, proteomic and metabolomic analyses. The remaining cellular fraction was diluted with phosphate-buffered saline (PBS) to replace the volume of plasma removed, and 100 μl aliquots from this mixture were processed for single-cell immunophenotyping by flow cytometry. With a starting volume of 1 ml, this standard operating protocol still left the cellular fraction contained in 400 µl of starting blood volume that could be used for other analyses. DOL: day of life
Fig. 2
Indexing cellular and soluble immune markers revealed developmental progression over the first week of life. a, b Principal component analysis was used to plot cellular composition (a) and plasma cytokine/chemokine concentrations (b) for each sample; this highlighted the substantial variability between participants and the lack of defined clustering by DOL, because variance between individuals outweighed the effect of ontogeny. c, d Accounting for repeated measures from the same individual across sampling days relative to DOL0 (indexing to DOL0) revealed clustering of samples by DOL. e, f Normalized cell counts showing developmental trajectories for cell populations that significantly changed (e) or did not change (f) over the first week of life. g, h Normalized plasma cytokine/chemokine concentrations showing developmental trajectories for cytokines/chemokines that significantly changed (g) or did not change (h) over the first week of life. Boxplots display medians with lower and upper hinges representing first and third quartiles. Whiskers reach the highest and lowest values, no more than 1.5× interquartile range from the hinge. ****p ≤ 0.0001, ***p ≤ 0.001, **p ≤ 0.01, *p ≤ 0.05, ns p > 0.05, Kruskal−Wallis test, Benjamini−Hochberg adjusted p values. DOL: day of life
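One simple way to read "indexing to DOL0" is as a per-participant difference from that participant's own DOL0 measurement, which removes stable offsets between participants. The sketch below assumes that reading; the caption does not specify the exact procedure, and the function name, record layout, and toy values are hypothetical.

```python
def index_to_baseline(records, baseline_day=0):
    """Subtract each participant's baseline value from their later samples.

    records: iterable of (participant_id, day_of_life, value) tuples.
    Returns (participant_id, day_of_life, change_from_baseline) for every
    non-baseline sample whose participant has a baseline measurement.
    This is an illustrative assumption, not the authors' published method.
    """
    records = list(records)
    # One baseline value per participant (the DOL0 sample).
    baseline = {pid: value for pid, day, value in records if day == baseline_day}
    indexed = []
    for pid, day, value in records:
        if day == baseline_day or pid not in baseline:
            continue
        indexed.append((pid, day, value - baseline[pid]))
    return indexed


# Toy data: participants A and B differ widely at baseline but change equally.
toy = [
    ("A", 0, 10.0), ("A", 3, 12.0),
    ("B", 0, 50.0), ("B", 3, 52.0),
    ("C", 0, 30.0), ("C", 7, 35.0),
]
indexed_toy = index_to_baseline(toy)
```

In the toy data the raw DOL3 values of A and B sit far apart, while their indexed values coincide, which is the kind of effect that would let samples group by DOL rather than by participant.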
Fig. 3
Transcriptomic, proteomic, and metabolomic analyses identified a robust trajectory of differentially expressed genes, proteins, and metabolites over the first week of life. a Up- and downregulated differentially expressed genes were plotted by DOL (vs. DOL0) and numbers of genes are listed above each point except for downregulated genes at DOL1 vs. DOL0, where the number was zero. b, c Up- and downregulated differentially expressed proteins and metabolites, respectively, plotted by DOL compared to DOL0, with numbers of differentially expressed proteins or metabolites listed above each point. d Zero-order interaction networks for genes differentially expressed at DOL3 vs. DOL0 and DOL7 vs. DOL0. Within networks, upregulated nodes are displayed in red and downregulated nodes in blue. DOL: day of life
Fig. 4
Integration of multiple data types via NetworkAnalyst molecular interaction networks provided novel biological insights. Minimum-connected networks for DOL3 vs. DOL0 (a) and DOL7 vs. DOL0 (b), respectively, containing all three individual data types, where nodes derived from the transcriptome are shown in blue, nodes from the metabolome in red, and nodes from the proteome in green. Novel nodes, which are nodes that only appeared after integrating the three data types but are not present in the individual minimum network, are shown in orange. DOL: day of life
Fig. 5
DIABLO uncovered biologically relevant features by integrating information across data types. Schematic representation of two contrasting integration approaches using multivariate techniques: a shows that DIABLO selects features jointly across data types, resulting in the identification of features with strong associations across data types. Conversely, as shown in b, ensembles of multivariate models, constructed independently of each other, result in a selection of features that are poorly associated across data types. This is visualized in correlation heatmaps of the selected features and corresponding networks, with dense subgraphs, or network modules, encircled. In particular, the network modules identified in (a) include a number of features selected from all data types. This is not the case in b. The minimal set of features selected by DIABLO across data types as shown in c could discriminate between DOL and distinct sets of these features separated DOL0 from all other DOLs (DIABLO component 1) and DOL1, 3, and 7 from each other (DIABLO component 2). Features identified by DIABLO (blue bars) were largely distinct from those identified by more traditional single-OMICs multivariate approaches (red bars; overlaps in gray); shown in d using an UpSet plot. Moreover, features identified by DIABLO were more strongly enriched for known biological (functional) pathways; shown in e using an UpSet plot (blue vs. red bars). Horizontal bars are mapped to the number of elements in each set of features being compared. Vertical bars correspond to the number of elements in the intersections when carrying out various set comparisons. DIABLO: Data Integration Analysis for Biomarker discovery using Latent cOmponents, DOL: day of life
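The two kinds of bars in an UpSet plot can be illustrated with a small sketch: horizontal bars are per-set sizes, and vertical bars are intersection sizes. The version below assumes the common "exclusive" convention, where each element is counted once under the exact combination of sets it belongs to; the caption does not say which convention was used, and the set and feature names are hypothetical.

```python
from collections import Counter


def upset_counts(named_sets):
    """Return (set_sizes, intersection_sizes) for a dict of name -> set.

    set_sizes: number of elements in each set (horizontal bars).
    intersection_sizes: for each combination of set names, the number of
    elements that belong to exactly those sets (vertical bars, exclusive mode).
    """
    set_sizes = {name: len(members) for name, members in named_sets.items()}
    membership = Counter()
    for element in set().union(*named_sets.values()):
        combo = frozenset(
            name for name, members in named_sets.items() if element in members
        )
        membership[combo] += 1
    return set_sizes, dict(membership)


# Hypothetical feature sets selected by two approaches.
sizes, intersections = upset_counts({
    "DIABLO": {"f1", "f2", "f3"},
    "single_omics": {"f3", "f4"},
})
```

Here `intersections[frozenset({"DIABLO", "single_omics"})]` is the overlap between the two approaches, corresponding to the gray bars described in the caption.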
Fig. 6
Independent validation and data meta-integration of the robust developmental trajectory during the first week of life. Generalizability of the multivariate integrative model (DIABLO) depicted in Fig. 5 based on data from Gambian newborns was evaluated by assessing its ability to classify DOL from OMICs profiles in a new set of validation samples collected from newborns from a second site (Papua New Guinea (PNG)). a Pathway enrichments of Molecular Interaction Networks Integration, DIABLO and MMRN identified congruent functional pathways of the first week of life. b The dashed line corresponds to the 95% confidence level ellipses for the scores obtained from the Gambia training data. Samples from the PNG site generally resided within the correct ellipse, demonstrating good agreement between actual DOL and DOL as predicted by the model. Similar figures were generated for other OMICs data (Supplementary Figure 10). c This agreement was quantified using area under the receiver operating characteristic curve (AUROC) analysis comparing DOL0 (red), 1 (blue), 3 (green), and 7 (purple) individually vs. all other DOLs combined. d Zero-order interaction networks for DOL7 vs. DOL0 containing nodes for transcriptome (blue), proteome (green), metabolome (red), and DIABLO-selected features (purple). Genes involved in the interferon and complement pathways and neutrophil degranulation are highlighted by the orange boxes. e–g Relative abundance of a selected subset of markers identified by DIABLO is shown for each DOL for both the Gambian cohort, on which the model was trained, and the validation cohort from PNG. The cells (flow cytometry; FC), plasma cytokines (Luminex assay; CYT), plasma proteins (mass-spectrometry proteomics; PROT), transcripts (RNA-Seq; RNA), and metabolites (mass-spectrometry metabolomics; META) identified by DIABLO were associated with interferon signaling (e), neutrophil recruitment and activation (f), and complement pathways (g). The differences observed between DOLs in the Gambia cohort were generally replicated in the PNG cohort. Boxplots display medians with lower and upper hinges representing first and third quartiles; whiskers reach the highest and lowest values no more than 1.5× interquartile range from the hinge. ****p ≤ 0.0001, ***p ≤ 0.001, **p ≤ 0.01, *p ≤ 0.05, ns p > 0.05, by ANOVA. DIABLO: Data Integration Analysis for Biomarker discovery using Latent cOmponents, DOL: day of life, MMRN: multiscale, multifactorial response network
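The one-vs-rest AUROC comparison described for panel c can be sketched as follows. This is a minimal illustration using the pairwise (Mann-Whitney) definition of AUROC, not the authors' code; the function names, labels, and scores are hypothetical, and the scores stand in for whatever per-sample output the trained model produces for a given DOL.

```python
def auroc(scores_pos, scores_neg):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win, matching the Mann-Whitney formulation.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def one_vs_rest_auroc(labels, scores, target):
    """AUROC for `target` (e.g. DOL0) against all other labels combined."""
    pos = [s for lab, s in zip(labels, scores) if lab == target]
    neg = [s for lab, s in zip(labels, scores) if lab != target]
    return auroc(pos, neg)


# Hypothetical validation samples: true DOL and a model score for "is DOL0".
true_dol = [0, 0, 1, 3, 7, 7]
dol0_score = [0.9, 0.8, 0.4, 0.85, 0.1, 0.2]
dol0_auroc = one_vs_rest_auroc(true_dol, dol0_score, target=0)
```

Repeating the call with `target` set to 1, 3, and 7 (each with its own score vector) would give one AUROC per DOL, matching the four curves described in the caption.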

Source: PubMed
