AACR Project GENIE: Powering Precision Medicine through an International Consortium

AACR Project GENIE Consortium, Fabrice André, Monica Arnedos, Alexander S Baras, José Baselga, Philippe L Bedard, Michael F Berger, Mariska Bierkens, Fabien Calvo, Ethan Cerami, Debyani Chakravarty, Kristen K Dang, Nancy E Davidson, Catherine Del Vecchio Fitz, Semih Dogan, Raymond N DuBois, Matthew D Ducar, P Andrew Futreal, Jianjiong Gao, Francisco Garcia, Stu Gardos, Christopher D Gocke, Benjamin E Gross, Justin Guinney, Zachary J Heins, Stephanie Hintzen, Hugo Horlings, Jan Hudeček, David M Hyman, Suzanne Kamel-Reid, Cyriac Kandoth, Walter Kinyua, Priti Kumari, Ritika Kundra, Marc Ladanyi, Céline Lefebvre, Michele L LeNoue-Newton, Eva M Lepisto, Mia A Levy, Neal I Lindeman, James Lindsay, David Liu, Zhibin Lu, Laura E MacConaill, Ian Maurer, David S Maxwell, Gerrit A Meijer, Funda Meric-Bernstam, Christine M Micheel, Clinton Miller, Gordon Mills, Nathanael D Moore, Petra M Nederlof, Larsson Omberg, John A Orechia, Ben Ho Park, Trevor J Pugh, Brendan Reardon, Barrett J Rollins, Mark J Routbort, Charles L Sawyers, Deborah Schrag, Nikolaus Schultz, Kenna R Mills Shaw, Priyanka Shivdasani, Lillian L Siu, David B Solit, Gabe S Sonke, Jean Charles Soria, Parin Sripakdeevong, Natalie H Stickle, Thomas P Stricker, Shawn M Sweeney, Barry S Taylor, Jelle J Ten Hoeve, Stacy B Thomas, Eliezer M Van Allen, Laura J Van 't Veer, Tony van de Velde, Harm van Tinteren, Victor E Velculescu, Carl Virtanen, Emile E Voest, Lucy L Wang, Chetna Wathoo, Stuart Watt, Celeste Yu, Thomas V Yu, Emily Yu, Ahmet Zehir, Hongxin Zhang

Abstract

The AACR Project GENIE is an international data-sharing consortium focused on generating an evidence base for precision cancer medicine by integrating clinical-grade cancer genomic data with clinical outcome data for tens of thousands of cancer patients treated at multiple institutions worldwide. In conjunction with the first public data release from approximately 19,000 samples, we describe the goals, structure, and data standards of the consortium and report conclusions from high-level analysis of the initial phase of genomic data. We also provide examples of the clinical utility of GENIE data, such as an estimate of clinical actionability across multiple cancer types (>30%) and prediction of accrual rates to the NCI-MATCH trial that accurately reflect recently reported actual match rates. The GENIE database is expected to grow to >100,000 samples within 5 years and should serve as a powerful tool for precision cancer medicine.

Significance: The AACR Project GENIE aims to catalyze sharing of integrated genomic and clinical datasets across multiple institutions worldwide, and thereby enable precision cancer medicine research, including the identification of novel therapeutic targets, design of biomarker-driven clinical trials, and identification of genomic determinants of response to therapy.

Cancer Discov; 7(8); 818-31. ©2017 AACR.

See related commentary by Litchfield et al., p. 796. This article is highlighted in the In This Issue feature, p. 783.

Conflict of interest statement

Disclosure of Potential Conflicts of Interest: F. André reports receiving commercial research grants from AstraZeneca, Lilly, Novartis, and Pfizer. M. Arnedos has received honoraria from the speakers bureaus of Novartis and AstraZeneca, and is a consultant/advisory board member for Puma. D.M. Hyman reports receiving commercial research grants from AstraZeneca, Loxo Oncology, and PUMA Biotechnology, and is a consultant/advisory board member for Atara Biotherapeutics, Chugai, and CytomX. M.A. Levy is an advisory board member of Personalis, Inc., and receives royalty distribution from GenomOncology. F. Meric-Bernstam reports receiving commercial research grants from Aileron, AstraZeneca, Bayer, Calithera, Curis, CytomX, Debiopharma, Effective Pharma, Genentech, Jounce, Novartis, PUMA, Taiho, and Zymeworks, and is a consultant/advisory board member for Clearlight Diagnostics, Darwin Health, Dialecta, GRAIL, Infection Biosciences, and Pieris. G.B. Mills reports receiving commercial research grants from Adelson Medical Research Foundation, AstraZeneca, Breast Cancer Research Foundation, Critical Outcome Technologies, Illumina, Karus, Komen Research Foundation, NanoString, and Takeda/Millennium Pharmaceuticals; has received honoraria from the speakers bureaus of Allostery, AstraZeneca, ImmunoMet, ISIS Pharmaceuticals, Lilly, MedImmune, Novartis, Pfizer, Symphogen, and Tarveda; has ownership interest (including patents) in Catena Pharmaceuticals, ImmunoMet, Myriad Genetics, PTV Ventures, and Spindletop Ventures; and is a consultant/advisory board member for Adventist Health, Allostery, AstraZeneca, Catena Pharmaceuticals, Critical Outcome Technologies, ImmunoMet, ISIS Pharmaceuticals, Lilly, MedImmune, Novartis, Precision Medicine, Provista Diagnostics, Signalchem Lifesciences, Symphogen, Takeda/Millennium Pharmaceuticals, Tarveda, and Tau Therapeutics. T.J. Pugh is a consultant/advisory board member for Dynacare. C.L. Sawyers is a consultant/advisory board member for Novartis. V.E. Velculescu has ownership interest (including patents) in Personal Genome Diagnostics and is a consultant/advisory board member for the same. No potential conflicts of interest were disclosed by the other authors.

One of the Editors-in-Chief is an author on this article. In keeping with the AACR's editorial policy, the peer review of this submission was managed by a senior member of Cancer Discovery's editorial team; a member of the AACR Publications Committee rendered the final decision concerning acceptability.

©2017 American Association for Cancer Research.

Figures

Figure 1
AACR Project GENIE at a glance. A, Variant calls and a limited clinical dataset from patients treated at each of the participating centers are sent to the Synapse platform, developed by Sage Bionetworks, where the data are harmonized and protected health information (PHI) removed in a secure Health Insurance Portability and Accountability Act (HIPAA)-compliant environment that provides data governance. Once harmonized, these data are viewed and analyzed in the cBioPortal for Cancer Genomics. Value is provided to both the data generators and the consortium by establishing 6-month periods of exclusivity to each prior to the data becoming available to the broader research community. B, Once data are available in the cBioPortal, clinical research projects are proposed and vetted by the project steering committee. Clinical teams are then assembled to define the clinical attributes required to answer the approved research question; these data are then manually curated from the relevant medical records and deposited in an electronic data capture system. The detailed clinical data are then transferred to Synapse where they are linked with the appropriate genomic and limited clinical data and are viewable and analyzable in the cBioPortal platform. Again, value is created by providing a period of at least 6 months' exclusivity to both the consortium and sponsors, where relevant. The primary data are made public at the time of publication.
Figure 2
Landscape overview of GENIE dataset. A, The degree of overlap at the gene level across the contributing centers' genomic assays is shown. A core set of 44 genes (listed in the inlay) is represented across all genomic assays in the GENIE dataset. The 2 additional genes listed in gray in the bottom right of the inlay were common to the smaller panels and absent from some previous versions of the larger panels, but are present on the most recent version of all panels. B, Total sample counts by tumor type and contributing center. The contribution of samples for each tumor type across the institutions is shown within each bar of the lower stacked barplot. C, Mutations (all nonsilent substitutions and small insertions/deletions reported) per coding megabase (Mb) sequenced for each sample, stratified by tumor type, and ordered by median mutation rate in those tumor types. The data are shown as empirical cumulative distributions (blue-shaded area) with individual samples shown as points colored black to red for low to high mutation burden, respectively. These data are limited to the 14,310 samples analyzed by the larger gene panels used at centers DFCI, MSK, and VICC.
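The per-sample rate in panel C is a simple normalization, sketched below in Python under stated assumptions. The function names, the sample tuples, and the tumor type labels are hypothetical and not taken from the GENIE pipeline. The caption does not state the sort direction, so ascending order is assumed here.

```python
from statistics import median


def mutations_per_coding_mb(nonsilent_mutations, coding_bases_covered):
    """Nonsilent mutation count normalized to one megabase of coding sequence."""
    if coding_bases_covered <= 0:
        raise ValueError("coding_bases_covered must be positive")
    return nonsilent_mutations * 1_000_000 / coding_bases_covered


def order_tumor_types_by_median(rates_by_tumor_type):
    """Return tumor types sorted by ascending median mutation rate (assumed direction)."""
    return sorted(rates_by_tumor_type, key=lambda t: median(rates_by_tumor_type[t]))


# Hypothetical samples: (tumor type, nonsilent mutations, coding bases covered by the panel)
samples = [
    ("Tumor type A", 4, 1_000_000),
    ("Tumor type A", 10, 1_000_000),
    ("Tumor type B", 2, 800_000),
    ("Tumor type B", 1, 800_000),
]

rates = {}
for tumor_type, count, bases in samples:
    rates.setdefault(tumor_type, []).append(mutations_per_coding_mb(count, bases))

print(order_tumor_types_by_median(rates))
```

Normalizing by the coding bases each panel actually covers is what makes rates comparable across panels of different sizes.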
Figure 3
Genomic alterations in non-small cell lung cancer, breast cancer, and colorectal cancer. A-C, The genomic alteration rate (including mutation, copy number, and rearrangement) aggregated to the gene level across the cohort for the top three most common tumor types is shown: non-small cell lung cancer, colorectal cancer, and breast cancer (A-C, respectively). Data for each center are shown as percentage of samples from that center with genomic alterations in a given gene. Directly adjacent to the main heat map is the proportional breakdown of the types of genomic alterations observed, and characterization of the mutation distribution observed in a given gene as oncogene and tumor suppressor, based on the normalized entropy ($\log_2(N) - \sum_i p_i \log_2(p_i)$, where $N$ is the number of unique mutations in a given gene and $p_i$ is the proportion of mutations accounted for by a given unique mutation of a given gene) in the mutation spectrum and the prevalence of truncating and frameshift mutations, respectively. These data are limited to genes with any of the following: (i) 15% genomic alteration rate in at least one center, (ii) 5% genomic alteration rate in at least three centers, or (iii) OncoKB level 1 or 2A evidence for the tumor types shown. The “nc” designation in the colorbar legend indicates no coverage.
Figure 4
Potential clinical actionability. Tumor types are shown by decreasing overall frequency of actionability. Actionability was defined by the union of three knowledge bases: My Cancer Genome (http://mycancergenome.org), OncoKB (http://oncokb.org), and the Personalized Cancer Therapy knowledge base (http://pct.mdanderson.org). For each tumor sample, the highest level of actionability of any variant was considered. Only tumor types with 100 or more samples were included in this analysis.
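The per-sample rule described here (take the union of the knowledge bases, then keep the highest level found for any variant) can be sketched in Python as follows. This is an illustration only: the level scale, its ordering, the variant labels, and the dictionary-based knowledge bases are all hypothetical stand-ins, not the actual schemas of My Cancer Genome, OncoKB, or the Personalized Cancer Therapy knowledge base.

```python
# Hypothetical harmonized level scale, most actionable first (an assumption).
LEVELS = ["1", "2A", "2B", "3A", "3B"]
RANK = {level: position for position, level in enumerate(LEVELS)}


def highest_actionability(sample_variants, knowledge_bases):
    """Return the best level any knowledge base assigns to any variant, or None."""
    best = None
    for variant in sample_variants:
        for kb in knowledge_bases:  # union: a hit in any knowledge base counts
            level = kb.get(variant)
            if level is not None and (best is None or RANK[level] < RANK[best]):
                best = level
    return best


# Hypothetical knowledge bases mapping variant label -> level
kb_one = {"GENE1 variant a": "3A"}
kb_two = {"GENE1 variant a": "2A", "GENE2 variant b": "3B"}
kb_three = {"GENE3 variant c": "1"}
all_kbs = [kb_one, kb_two, kb_three]

print(highest_actionability(["GENE1 variant a", "GENE2 variant b"], all_kbs))
print(highest_actionability(["GENE9 variant z"], all_kbs))
```

A sample with no variant in any knowledge base yields `None`, which would correspond to a sample counted as not actionable.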
Figure 5
Clinical trial matching. Overview of GENIE samples matched to NCI-MATCH, based on genomic and cancer type criteria. Each patient with a reported sequencing date in 2014 or later was matched against 18 arms of the study that use somatic mutations or copy-number alterations for enrollment. Arms with fusion criteria were excluded because only two of the eight contributing GENIE centers provided fusion data. A, Information regarding 18 arms of the NCI-MATCH trial, including a summary of genomic trial eligibility, and the total count of GENIE samples matched. For arms S1 and U (indicated with an asterisk), the exact set of inactivating mutations was not specified in the NCI protocol, and all mutations were therefore considered matches. B, Proportion of the matches attributed to the top 10 most frequently matched cancer types. The categories are the top-level OncoTree codes. C, Comparison of the observed matching rate in the GENIE cohort with the rates reported by the NCI-MATCH group for the first 645 patients. Substudies X and Z1D had not reported interim rates.
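The matching logic in this caption can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the arm definitions, field names, alteration labels, and cancer types are hypothetical and do not reproduce the NCI-MATCH protocol, and the rate is computed here over samples sequenced in 2014 or later.

```python
from datetime import date

CUTOFF = date(2014, 1, 1)  # sequencing date of 2014 or later, per the caption


def matches_arm(sample, arm):
    """True if the sample meets the arm's cancer type and genomic criteria."""
    if sample["sequencing_date"] < CUTOFF:
        return False
    if sample["cancer_type"] in arm["excluded_cancer_types"]:
        return False
    # Genomic criterion: at least one alteration the arm accepts
    return bool(sample["alterations"] & arm["eligible_alterations"])


def matching_rate(samples, arms):
    """Fraction of date-eligible samples that match at least one arm."""
    eligible = [s for s in samples if s["sequencing_date"] >= CUTOFF]
    if not eligible:
        return 0.0
    matched = sum(1 for s in eligible if any(matches_arm(s, a) for a in arms))
    return matched / len(eligible)


# Hypothetical arms and samples
arms = [
    {"eligible_alterations": {"GENE1 mutation"}, "excluded_cancer_types": {"Type B"}},
    {"eligible_alterations": {"GENE2 amplification"}, "excluded_cancer_types": set()},
]
samples = [
    {"sequencing_date": date(2015, 3, 1), "cancer_type": "Type A", "alterations": {"GENE1 mutation"}},
    {"sequencing_date": date(2013, 6, 1), "cancer_type": "Type A", "alterations": {"GENE1 mutation"}},
    {"sequencing_date": date(2016, 1, 9), "cancer_type": "Type B", "alterations": {"GENE1 mutation"}},
    {"sequencing_date": date(2016, 5, 2), "cancer_type": "Type A", "alterations": {"GENE2 amplification"}},
    {"sequencing_date": date(2017, 2, 7), "cancer_type": "Type C", "alterations": {"GENE3 mutation"}},
]

print(matching_rate(samples, arms))
```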

Source: PubMed
