FACETS: allele-specific copy number and clonal heterogeneity analysis tool for high-throughput DNA sequencing

Ronglai Shen, Venkatraman E Seshan

Abstract

Allele-specific copy number analysis (ASCN) from next-generation sequencing (NGS) data can greatly extend the utility of NGS beyond the identification of mutations to precisely annotate the genome for the detection of homozygous/heterozygous deletions, copy-neutral loss-of-heterozygosity (LOH) and allele-specific gains/amplifications. In addition, as targeted gene panels are increasingly used in clinical sequencing studies for the detection of 'actionable' mutations and copy number alterations to guide treatment decisions, accurate integer copy number calls adjusted for tumor purity, ploidy and clonal heterogeneity are greatly needed to interpret NGS-based cancer gene copy number data more reliably in the context of clinical sequencing. We developed FACETS, an open-source ASCN tool with broad application to whole-genome, whole-exome and targeted panel sequencing platforms. It is a fully integrated stand-alone pipeline that includes sequencing BAM file post-processing, joint segmentation of total and allele-specific read counts, and integer copy number calls corrected for tumor purity, ploidy and clonal heterogeneity, with comprehensive output and integrated visualization. We demonstrate the application of FACETS using The Cancer Genome Atlas (TCGA) whole-exome sequencing of lung adenocarcinoma samples. We also demonstrate its application to a clinical sequencing platform based on a targeted gene panel.
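
To make the two per-SNP quantities behind the joint segmentation concrete, the following is a minimal Python sketch, not the FACETS implementation. It assumes read counts have already been extracted at SNP sites from the tumor and matched normal BAM files. The function names, the library-size normalization and the 0.5 pseudocount are illustrative assumptions rather than details taken from the paper, and the published tool may apply further normalization not shown here.

```python
import math


def log_ratio(tumor_depth, normal_depth, tumor_total, normal_total):
    """Total copy number log-ratio (logR) at one SNP site.

    Assumption: each depth is divided by its library's total read count,
    so equal relative coverage in tumor and normal gives logR = 0.
    """
    if min(tumor_depth, normal_depth, tumor_total, normal_total) <= 0:
        raise ValueError("all counts must be positive")
    return math.log2((tumor_depth / tumor_total) / (normal_depth / normal_total))


def log_odds_ratio(t_var, t_ref, n_var, n_ref, pseudo=0.5):
    """Allelic log-odds-ratio (logOR) at one heterozygous SNP site.

    Assumption: a pseudocount (hypothetical default 0.5) is added to every
    cell so that a zero count does not produce an infinite value.
    """
    tumor_odds = (t_var + pseudo) / (t_ref + pseudo)
    normal_odds = (n_var + pseudo) / (n_ref + pseudo)
    return math.log(tumor_odds / normal_odds)


# Toy counts (made up for illustration): equal relative depth but a skewed
# tumor allele ratio, the pattern expected for copy-neutral LOH.
logr = log_ratio(tumor_depth=120, normal_depth=60,
                 tumor_total=2_000_000, normal_total=1_000_000)
logor = log_odds_ratio(t_var=100, t_ref=20, n_var=30, n_ref=30)
print(f"logR = {logr:.3f}, logOR = {logor:.3f}")
```

In this toy case logR stays at zero while logOR moves away from zero, which is why a segmentation that looks at total counts alone would miss the event.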

© The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

Figures

Figure 1.
Joint segmentation identifies a copy-neutral loss-of-heterozygosity (LOH) event. Top panel shows the copy number log-ratio of total sequence read count in the tumor to that in the normal along genomic positions on chromosome 6, from whole-exome sequencing of a lung cancer patient sample. Second panel shows the allelic log-odds-ratio of the variant allele read counts in the tumor/normal pair, revealing a copy-neutral LOH event on 6p.
Figure 2.
Integrated visualization of FACETS analysis of whole-exome sequencing data from a TCGA chromophobe renal cell carcinoma sample (TCGA-KL-8331). The top panel displays total copy number log-ratio (logR), and the second panel displays allele-specific log-odds-ratio data (logOR) with chromosomes alternating in blue and gray. The third panel plots the corresponding integer (total, minor) copy number calls. The overall tumor ploidy is estimated to be 1.6, revealing a hypodiploid tumor genome due to the whole-chromosomal losses of multiple chromosomes. The tumor sample purity is estimated to be 0.89. The estimated cellular fraction (cf) profile is plotted at the bottom, revealing both clonal and subclonal copy number events.
Figure 3.
Pre-processing and joint segmentation. (A) Parsing reference and variant allele counts for SNP sites from tumor-normal sequencing BAM files. All SNP sites contribute to the total copy number log-ratio (logR), and heterozygous sites contribute to the allelic logOR. (B) Interval sampling to reduce local serial dependencies in SNP-dense regions. (C) Joint segmentation of logR and logOR and the detection of copy number aberrant regions of the genome. (D) Segment clustering to form groups with the same latent copy number states.
Figure 4.
Joint analysis of total and allelic copy number patterns to more accurately estimate tumor purity, ploidy and the precise genotypes of the copy number alterations. Two examples (A and B) are presented here to illustrate the use of allelically balanced segments (logOR close to zero) to determine the 2-copy state (purple line) and the location shift λ in total copy number log-ratio (logR) due to aneuploidy of the tumor. (C) The expected values of logR and logOR as a function of total and minor copy number and cellular fraction Φ are plotted to show the degree of separability among different copy number genotypes and cellular fractions. Each line traces the cellular fraction from low (0.1) at the end near the origin (0, 0) to high (0.9) at the other end of the line. Triangles mark the cellular fraction of 0.5 on each line. The colors represent the minor copy number: 0 is black, 1 is red, 2 is green and 3 is blue. Line types change by total copy number.
Figure 5.
Kernel density plot of estimated cellular fraction reveals clonal and subclonal events.
Figure 6.
FACETS analysis of whole-exome sequencing of 286 TCGA lung adenocarcinoma samples. (A) Total number of segments per sample from standard CBS segmentation of total copy number versus FACETS joint segmentation of total and allele-specific copy ratios. (B) Proportion of concordantly detected segments between the two methods. (C) Comparing FACETS and ABSOLUTE tumor purity estimates. (D) Comparing FACETS and ABSOLUTE ploidy estimates. (E) Bubble plot of FACETS and ABSOLUTE integer copy number calls. The numbers of concordant (diagonal) and discordant (off-diagonal) alterations called are indicated inside each bubble.
Figure 7.
FACETS analysis of a lung squamous cell carcinoma from MSKCC profiled by MSK-IMPACT targeted cancer gene panel sequencing revealed several putative oncogenic drivers and druggable targets. Tumor purity- and ploidy-corrected FACETS analysis provides more accurate integer copy number calls for the driver genes. Integer copy numbers above 10 are plotted on a log10 scale.
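
The relationship described in the Figure 4 caption, expected logR and logOR as a function of total copy number, minor copy number and cellular fraction Φ, can be sketched in Python as follows. This is a hedged illustration under a standard two-population mixing assumption (aberrant cells at fraction Φ, all remaining cells diploid and heterozygous); the function names are hypothetical, and the way the location shift λ enters (`shift`, added to logR) is an assumed convention, not necessarily the paper's parameterization.

```python
import math


def expected_logr(total_cn, cell_frac, shift=0.0):
    """Expected logR for a segment whose aberrant cells carry total_cn copies.

    Assumption: the remaining (1 - cell_frac) of cells are diploid.
    `shift` is a placeholder for the aneuploidy location shift.
    """
    if not 0.0 <= cell_frac <= 1.0:
        raise ValueError("cell_frac must lie in [0, 1]")
    mean_copies = cell_frac * total_cn + (1.0 - cell_frac) * 2.0
    if mean_copies == 0.0:
        return -math.inf  # pure homozygous deletion
    return math.log2(mean_copies / 2.0) + shift


def expected_logor(total_cn, minor_cn, cell_frac):
    """Expected logOR (major allele over minor allele) for the same mixture."""
    if not 0.0 <= cell_frac <= 1.0:
        raise ValueError("cell_frac must lie in [0, 1]")
    if not 0 <= minor_cn <= total_cn - minor_cn:
        raise ValueError("minor_cn must satisfy 0 <= minor_cn <= total_cn / 2")
    # Each diploid heterozygous cell contributes one copy of each allele.
    major_dose = cell_frac * (total_cn - minor_cn) + (1.0 - cell_frac)
    minor_dose = cell_frac * minor_cn + (1.0 - cell_frac)
    if minor_dose == 0.0:
        return math.inf if major_dose > 0.0 else math.nan
    return math.log(major_dose / minor_dose)


# Trace a few (total, minor) genotypes across cellular fractions,
# mirroring the lines described for panel C.
for total_cn, minor_cn in [(1, 0), (2, 0), (2, 1), (3, 1), (4, 2)]:
    for phi in (0.1, 0.5, 0.9):
        r = expected_logr(total_cn, phi)
        o = expected_logor(total_cn, minor_cn, phi)
        print(f"({total_cn},{minor_cn}) phi={phi}: logR={r:+.3f} logOR={o:+.3f}")
```

Under these assumptions, a copy-neutral LOH genotype (2, 0) keeps logR at zero for every Φ while logOR grows with Φ, and a balanced genotype such as (2, 1) or (4, 2) keeps logOR at zero, which is what allows balanced segments to anchor the 2-copy state.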

Source: PubMed
