Whole-Genome Sequencing in Severe Chronic Obstructive Pulmonary Disease

Dmitry Prokopenko, Phuwanat Sakornsakolpat, Heide Loehlein Fier, Dandi Qiao, Margaret M Parker, Merry-Lynn N McDonald, Ani Manichaikul, Stephen S Rich, R Graham Barr, Christopher J Williams, Mark L Brantly, Christoph Lange, Terri H Beaty, James D Crapo, Edwin K Silverman, Michael H Cho

Abstract

Genome-wide association studies have identified common variants associated with chronic obstructive pulmonary disease (COPD). Whole-genome sequencing (WGS) offers comprehensive coverage of the entire genome, as compared with genotyping arrays or exome sequencing. We hypothesized that WGS in subjects with severe COPD and smoking control subjects with normal pulmonary function would allow us to identify novel genetic determinants of COPD. We sequenced 821 patients with severe COPD and 973 control subjects from the COPDGene and Boston Early-Onset COPD studies, including both non-Hispanic white and African American individuals. We performed single-variant and grouped-variant analyses, and in addition, we assessed the overlap of variants between sequencing- and array-based imputation. Our most significantly associated variant was in a known region near HHIP (combined P = 1.6 × 10⁻⁹); additional variants approaching genome-wide significance included previously described regions in CHRNA5, TNS1, and SERPINA6/SERPINA1 (the latter in African American individuals). None of our associations were clearly driven by rare variants, and we found minimal evidence of replication of genes identified by previously reported smaller sequencing studies. With WGS, we identified more than 20 million new variants, not seen with imputation, including more than 10,000 of potential importance in previously identified COPD genome-wide association study regions. WGS in severe COPD identifies a large number of potentially important functional variants, with the strongest associations being in known COPD risk loci, including HHIP and SERPINA1. Larger sample sizes will be needed to identify associated variants in novel regions of the genome.

Trial registration: ClinicalTrials.gov NCT00608764.

Keywords: association studies; chronic obstructive pulmonary disease; whole-genome sequencing.

Figures

Figure 1.
Manhattan plot for combined single-variant analysis of 1,794 individuals done with Firth logistic regression for case/control status, as produced by Efficient and Parallelizable Association Container Toolbox (EPACTS). Red line indicates genome-wide significance level.
Figure 2.
Bar plots of variant loadings in WGS and HRC-imputed data for (A) non-Hispanic white individuals and (B) African American individuals. The quality of the overlap was measured on the basis of an NRD cutoff of 5%. Common variants correspond to variants with MAF greater than 5%, low-frequency variants with MAF between 1% and 5%, and rare variants with MAF less than or equal to 1%. HRC = Haplotype Reference Consortium; MAF = minor allele frequency; NRD = nonreference discordance; WGS = whole-genome sequencing.
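The frequency bins in this caption can be written as a small classifier. The following is a minimal sketch, assuming the boundaries are read literally from the caption (common: MAF > 5%; low-frequency: 1% < MAF ≤ 5%; rare: MAF ≤ 1%); the function name `classify_maf` is hypothetical and is not taken from the study's analysis code.

```python
def classify_maf(maf: float) -> str:
    """Bin a minor allele frequency (as a fraction) using the caption's cutoffs.

    Assumption: a value of exactly 5% is treated as low-frequency, because the
    caption defines common variants as strictly greater than 5%.
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("MAF must lie in [0, 0.5]")
    if maf > 0.05:
        return "common"
    if maf > 0.01:
        return "low-frequency"
    return "rare"  # MAF <= 1%


if __name__ == "__main__":
    for value in (0.20, 0.03, 0.005):
        print(value, classify_maf(value))
```

Under this reading a monomorphic site (MAF of 0) would also be labeled rare; whether such sites were counted in the figure is not stated in the caption.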
Figure 3.
TreeMap of variants in 22 previously described genome-wide association study regions that are in linkage disequilibrium with the lead SNP (D′ ≥ 0.8), in the same subset of individuals with sequenced and imputed genotypes, for (A) non-Hispanic white individuals and (B) African American individuals. D = the subset of PF variants that also fall within one of the six most “damaging” consequence categories as per ANNOVAR (see data supplement for definition); HRC = variants found with imputed genotypes; PF = the subset of variants having an annotation suggesting some possible functional role (see data supplement for definition); WGS = variants found with whole-genome sequencing.

Source: PubMed
