The Cancer Genome Atlas Pan-Cancer analysis project

Cancer Genome Atlas Research Network, John N Weinstein, Eric A Collisson, Gordon B Mills, Kenna R Mills Shaw, Brad A Ozenberger, Kyle Ellrott, Ilya Shmulevich, Chris Sander, Joshua M Stuart

Abstract

The Cancer Genome Atlas (TCGA) Research Network has profiled and analyzed large numbers of human tumors to discover molecular aberrations at the DNA, RNA, protein and epigenetic levels. The resulting rich data provide a major opportunity to develop an integrated picture of commonalities, differences and emergent themes across tumor lineages. The Pan-Cancer initiative compares the first 12 tumor types profiled by TCGA. Analysis of the molecular aberrations and their functional roles across tumor types will teach us how to extend therapies effective in one cancer type to others with a similar genomic profile.

Figures

Figure 1. Integrated data set for the comparison and contrast of multiple tumour types
The Pan-Cancer project assembled data from thousands of patients with primary tumours occurring at different sites of the body, covering twelve tumour types (upper left panel): glioblastoma multiforme (GBM), acute myeloid leukemia (LAML), head and neck squamous carcinoma (HNSC), lung adenocarcinoma (LUAD), lung squamous carcinoma (LUSC), breast carcinoma (BRCA), kidney renal clear cell carcinoma (KIRC), ovarian carcinoma (OV), bladder carcinoma (BLCA), colon adenocarcinoma (COAD), uterine corpus endometrial carcinoma (UCEC) and rectal adenocarcinoma (READ). Omics characterization was performed on six platforms, creating a “data stack” (upper right panel) in which data elements across the platforms are linked because tissue material from the same samples was assayed, thus maximizing the potential of integrative analysis. Use of the data enables the identification of general trends, including common pathways (lower panel), revealing master regulatory hubs activated (red) or deactivated (blue) across different tissue types.
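The linkage behind the “data stack” can be illustrated with a minimal Python sketch. It assumes each platform's data is available as a mapping from sample identifier to a measurement; the function name, platform names, sample identifiers and values are hypothetical and are not taken from TCGA. The sketch keeps only the samples assayed on every platform and groups their measurements by sample.

```python
from typing import Dict


def build_data_stack(
    platform_data: Dict[str, Dict[str, float]],
) -> Dict[str, Dict[str, float]]:
    """Group measurements by sample, keeping samples present on all platforms.

    platform_data maps a (hypothetical) platform name to a mapping of
    sample identifier -> measurement.
    """
    if not platform_data:
        return {}

    # Samples assayed on every platform: intersect the per-platform ID sets.
    shared = set.intersection(*(set(d) for d in platform_data.values()))

    # One record per shared sample, holding its value from each platform.
    return {
        sample: {name: values[sample] for name, values in platform_data.items()}
        for sample in sorted(shared)
    }


if __name__ == "__main__":
    # Illustrative toy data only; not real TCGA measurements.
    toy = {
        "mutation": {"S1": 1.0, "S2": 0.0},
        "expression": {"S1": 2.5, "S3": 1.1},
    }
    print(build_data_stack(toy))
```

Restricting the stack to shared samples is one possible design choice. An analysis that tolerates missing platforms would take the union of the sample sets instead and record the gaps explicitly.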
Figure 2. Data coordination for the Pan-Cancer TCGA project
Data were collected by the biospecimen collection resource (BCR) from 12 different tumour types and characterized on six major platforms by the genome characterization and sequencing centers (GCC/GSC). Datasets are deposited into the TCGA data coordination center (DCC), from which they are distributed to the Broad Institute's Firehose and Memorial Sloan Kettering Cancer Center's cBioPortal for various automated processing pipelines. Analysis working groups (AWGs) conduct focused analyses on individual tumour types. Results from the DCC, Firehose and AWGs were collected and stored in Sage Bionetworks’ Synapse system to create a “data freeze.” Genome data analysis centers (GDACs) accessed and deposited both data and results through Synapse to coordinate distributed analyses.

Source: PubMed
