Personal omics profiling reveals dynamic molecular and medical phenotypes

Rui Chen, George I Mias, Jennifer Li-Pook-Than, Lihua Jiang, Hugo Y K Lam, Rong Chen, Elana Miriami, Konrad J Karczewski, Manoj Hariharan, Frederick E Dewey, Yong Cheng, Michael J Clark, Hogune Im, Lukas Habegger, Suganthi Balasubramanian, Maeve O'Huallachain, Joel T Dudley, Sara Hillenmeyer, Rajini Haraksingh, Donald Sharon, Ghia Euskirchen, Phil Lacroute, Keith Bettinger, Alan P Boyle, Maya Kasowski, Fabian Grubert, Scott Seki, Marco Garcia, Michelle Whirl-Carrillo, Mercedes Gallardo, Maria A Blasco, Peter L Greenberg, Phyllis Snyder, Teri E Klein, Russ B Altman, Atul J Butte, Euan A Ashley, Mark Gerstein, Kari C Nadeau, Hua Tang, Michael Snyder

Abstract

Personalized medicine is expected to benefit from combining genomic information with regular monitoring of physiological states by multiple high-throughput methods. Here, we present an integrative personal omics profile (iPOP), an analysis that combines genomic, transcriptomic, proteomic, metabolomic, and autoantibody profiles from a single individual over a 14 month period. Our iPOP analysis revealed various medical risks, including type 2 diabetes. It also uncovered extensive, dynamic changes in diverse molecular components and biological pathways across healthy and diseased conditions. Extremely high-coverage genomic and transcriptomic data, which provide the basis of our iPOP, revealed extensive heteroallelic changes during healthy and diseased states and an unexpected RNA editing mechanism. This study demonstrates that longitudinal iPOP can be used to interpret healthy and diseased states by connecting genomic information with additional dynamic omics activity.

Copyright © 2012 Elsevier Inc. All rights reserved.

Figures

Figure 1. Summary of study
(A) Time course summary. The subject was monitored for a total of 523 days, during which there were two infections (red bar, HRV; green bar, RSV). The black bar indicates the period when the subject 1) increased exercise; 2) ingested 81 mg of Acetylsalicylic Acid and Ibuprofen tablets each day (the latter only during the first 6 weeks of this period); and 3) substantially reduced sugar intake. Blue numbers indicate fasted time points. (B) iPOP experimental design indicating the tissues and analyses involved in this study. (C) Circos (Krzywinski et al., 2009) plot summarizing iPOP. From outer to inner rings: chromosome ideogram; genomic data (pale blue ring) - structural variants > 50 bp [deletions (blue tiles), duplications (red tiles)], indels (green triangles); transcriptomic data (yellow ring) - expression ratio of HRV infection to healthy states; proteomic data (light purple ring) - ratio of protein levels during HRV infection to healthy states; transcriptomic data (yellow ring) - differential heteroallelic expression ratio of alternative allele to reference allele for missense and synonymous variants (purple dots) and candidate RNA missense and synonymous edits (red triangles, purple dots, orange triangles and green triangles, respectively).
Figure 2. Medical findings
(A) High-interest disease- and drug-related variants in the subject's genome. (B) RiskGraph of the top 20 diseases with the highest post-test probabilities. For each disease, the arrow represents the pretest probability according to the subject's age, gender, and ethnicity. The line represents the post-test probability after incorporating the subject's genome sequence. Listed to the right are the numbers of independent disease-associated SNVs used to calculate the subject's post-test probability. (C) RiskOGram of Type 2 Diabetes. The RiskOGram illustrates how the subject's post-test probability of T2D was calculated using 28 independent SNVs. The middle graph displays the post-test probability. The left side shows the associated genes, SNVs, and the subject's genotypes. The right side shows the likelihood ratio (LR), number of studies, cohort sizes, and the post-test probability. (D) Blood glucose trend. Measurements were taken from samples analyzed at either non-fasted or fasted states; the non-fasted states (all but Days 186, 322, 329 and 369 and after Day 400) were at a fixed time after a constant meal and differed from the fasted states. Data are presented as a moving average with a window of 15 days. Red and green arrows and bars indicate the times of the HRV and RSV infections, respectively. Black arrows and bars indicate the period with lifestyle changes. (E) C-Reactive Protein trendline. (F) Serum cytokine profiles. Red box and day number, HRV infection; green box and day number, RSV infection; question mark, elevated cytokine levels indicating an unknown event at Day 301. Red indicates increased cytokine levels.
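The post-test probabilities in panels B and C can be illustrated with the standard odds form of Bayes' rule: convert the pretest probability to odds, multiply by the likelihood ratio (LR) of each independent SNV, and convert back to a probability. The following is a minimal sketch under that assumption, not the authors' pipeline; the function name and the numeric values are hypothetical and are not taken from the study.

```python
from math import prod


def post_test_probability(pretest_prob, likelihood_ratios):
    """Combine a pretest probability with independent likelihood ratios.

    Assumes the LRs are independent, so they can simply be multiplied.
    """
    if not 0.0 < pretest_prob < 1.0:
        raise ValueError("pretest_prob must be strictly between 0 and 1")
    # Probability -> odds
    pretest_odds = pretest_prob / (1.0 - pretest_prob)
    # Each independent variant scales the odds by its likelihood ratio
    post_odds = pretest_odds * prod(likelihood_ratios)
    # Odds -> probability
    return post_odds / (1.0 + post_odds)


# Hypothetical inputs for illustration only (not the subject's data).
example = post_test_probability(0.25, [1.2, 0.9, 1.5])
```

An LR above 1 raises the probability and an LR below 1 lowers it, so a long list of SNVs with LRs close to 1 can still shift the estimate noticeably once multiplied together.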
Figure 3. Transcriptome time course analysis
(A) Summary of approach for identification of differentially expressed components. The various omics sets were processed through a common framework involving spectral analysis, clustering and pathway enrichment analysis. (B) Pattern Classification. The different emergent patterns from the analysis of the transcriptome for the entire time course are displayed for the autocorrelation (I), spike maxima (II) and spike minima (III) classes. For different clusters, examples of gene connections in selected pathways based on Reactome (Croft et al., 2011) FI [Cytoscape (Smoot et al., 2011) plugin] are shown as networks. Example GO (Ashburner et al., 2000) enrichment analysis results from Cytoscape (Smoot et al., 2011) BiNGO (Maere et al., 2005) plugin and pathway enrichment results [Reactome (Croft et al., 2011) FI] are included.
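To give a sense of what separates an "autocorrelation" pattern from a "spike" pattern, the sketch below computes a plain lag-k sample autocorrelation for one time series. This is only an illustration: it assumes evenly spaced time points, which the study's sampling was not, and the function name is hypothetical rather than part of the authors' framework.

```python
def autocorrelation(series, lag=1):
    """Sample autocorrelation of a sequence at the given lag.

    Assumes evenly spaced observations (a simplification).
    """
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mu = sum(series) / n
    centered = [x - mu for x in series]
    denom = sum(c * c for c in centered)
    if denom == 0:
        # A constant series has no variance; report zero correlation.
        return 0.0
    num = sum(centered[i] * centered[i + lag] for i in range(n - lag))
    return num / denom
```

A smoothly trending series gives a positive lag-1 value, whereas a series dominated by a single isolated spike gives a value near zero or below, which is the intuition behind treating the two classes separately.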
Figure 4. Integrated omics analysis
For Days 186–400, the different emergent patterns from an integrated analysis of the transcriptome, proteome and metabolome data are displayed for the autocorrelation (I), spike maxima (II) and spike minima (III) classes. For different clusters, examples of gene connections in selected pathways based on Reactome (Croft et al., 2011) FI [Cytoscape (Smoot et al., 2011) plugin] are shown as networks, with constituents marked as assessed from protein data, transcriptome data or both. Example GO (Ashburner et al., 2000) enrichment analysis results from Cytoscape (Smoot et al., 2011) BiNGO (Maere et al., 2005) plugin and pathway enrichment results [Reactome (Croft et al., 2011) FI] are included.
Figure 5. Heteroallelic expression study of PBMCs
(A) Frequency of allele-specific expression (ASE) based on shrunk alternative/total ratios of RNA-Seq data. 143 positions fall outside the 3 standard deviation (σ) range (upper bound 0.66; see Figure S2B), suggesting that certain heterozygous alleles (DNA level) are preferentially expressed in PBMCs. Standard deviations (σ) are denoted with dotted lines and the average ratio overlapping across all time points is 0.49. (B) Digital droplet PCR validation of two heteroallelically expressed genes, PADI4 and PLOD (relative to alternative allele). (C) Heatmap of the HRV infection time course (7 time points) showing differential ASE during HRV infection Day 0 (red arrow) relative to average shrunk ratios of healthy states (Days 116–255). (D) Heatmap of the RSV infection time course (13 time points) showing differential ASE specific to RSV infection Day 289 (red arrow) relative to average shrunk ratios of healthy states (Days 311–400); the onset of T2D on Day 307 is also shown (red arrow). Heatmap ratios are relative to the alternative allele (alternative/total, posterior probability >0.75). An example of an enriched KEGG pathway gene cluster (Huang et al., 2009; Benjamini p<0.05) is shown below Figure 5C. See also Figures S2, S8.
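As a rough illustration of the thresholding described in panel A, the sketch below flags heterozygous positions whose alternative/total read ratio lies more than three standard deviations from the mean ratio. It works on raw ratios rather than the shrunk estimates used in the figure, and the function name and input layout are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def flag_ase_outliers(counts, n_sigma=3.0):
    """Return positions whose alt/total ratio is beyond n_sigma of the mean.

    `counts` maps a position label to (alt_reads, total_reads).
    Positions with zero total reads are skipped.
    """
    ratios = {
        pos: alt / total
        for pos, (alt, total) in counts.items()
        if total > 0
    }
    if not ratios:
        return {}
    mu = mean(ratios.values())
    sigma = pstdev(ratios.values())
    lower, upper = mu - n_sigma * sigma, mu + n_sigma * sigma
    # Keep only positions outside the [lower, upper] band.
    return {pos: r for pos, r in ratios.items() if r < lower or r > upper}
```

With a mean ratio near 0.5, positions flagged this way are those where one allele contributes most of the reads, which is the pattern the caption describes as preferential expression.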
Figure 6. RNA editing and miRNA expression of PBMCs
(A) Distribution of RNA editing types in missense (red) and synonymous and UTRs (blue), based on seven or more time points (total 20 time points). (B) Selected summary of known and novel RNA edits expressed in PBMCs. RNA edits were validated by digital PCR (green) and proteomic mass spectrometry (yellow). (C) Detail of two missense-causing edit sites in BLCAP. Selected data from RNA-Seq at Day 4 and Day 255 (top left), Sanger sequencing of Day 255 cDNA (bottom left) and digital PCR (right panel) are shown. (D) Digital droplet PCR analysis of novel edit sites in the SCFD2 (left) and FBXO25 (right) genes shows no variants in DNA, while in RNA, editing is evident (top left quadrant). (E and F) Expression of SNV-containing and SNV-free miRNA, respectively, for Days 4, 21, 116, 185 and 186. Red lines: mean; error bars: standard error of the mean. Genome browsers, chromatograms and digital PCR data were analyzed with software from DNAnexus Inc., Chromas Ltd. and Quantalife™, respectively. See related Figures S2, S8 and Supplementary Data.

Source: PubMed
