Comprehensive red blood cell and platelet antigen prediction from whole genome sequencing: proof of principle

William J Lane, Connie M Westhoff, Jon Michael Uy, Maria Aguad, Robin Smeland-Wagman, Richard M Kaufman, Heidi L Rehm, Robert C Green, Leslie E Silberstein, MedSeq Project, David W Bates, Alexis D Carere, Allison Cirino, Kurt D Christensen, Robert C Green, Carolyn Y Ho, Lily Hoffman-Andrews, Joel B Krier, William J Lane, Denise M Perry, Lisa Lehmann, Calum A MacRae, Cynthia C Morton, Christine E Seidman, Shamil Sunyaev, Jason L Vassy, Rebecca Walsh, Sandy Aronson, Ozge Ceyhan-Birsoy, Siva Gowrisankar, Matthew S Lebo, Ignat Leschiner, Kalotina Machini, Heather M McLaughlin, Danielle R Azzariti, Heidi L Rehm, Jennifer Blumenthal-Barby, Lindsay Zausmer Feuerman, Leila Jamal, Kaitlyn Lee, Amy L McGuire, Jill Oliver Robinson, Melody J Slashinski, Julia Wycliff, Philip Lupo, Stewart C Alexander, Shubhangi Arora, Kelly Davis, Christine Kirby, Peter A Ubel, Peter Kraft, J Scott Roberts, Judy E Garber, Dmitry Dukhovny, Tina Hambuch, Michael F Murray, Isaac Kohane, Sek Won Kong

Abstract

Background: There are 346 serologically defined red blood cell (RBC) antigens and 33 serologically defined platelet (PLT) antigens, most of which have known genetic changes in 45 RBC or six PLT genes that correlate with antigen expression. Polymorphic sites associated with antigen expression in the primary literature and reference databases are annotated according to nucleotide positions in cDNA. This makes antigen prediction from next-generation sequencing data challenging, because sequencing data are reported in genomic coordinates rather than cDNA positions.

Study design and methods: The conventional cDNA reference sequences for all known RBC and PLT genes that correlate with antigen expression were aligned to the human reference genome. The alignments allowed conversion of conventional cDNA nucleotide positions to the corresponding genomic coordinates. RBC and PLT antigen prediction was then performed using the human reference genome and whole genome sequencing (WGS) data with serologic confirmation.
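The position conversion described above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function name `cds_to_genomic`, the way coding segments are represented, and the toy coordinates are all assumptions. It covers only the simple case in which the cDNA reference matches the genome base for base, with no insertions, deletions, or mismatches in the alignment.

```python
from typing import List, Tuple


def cds_to_genomic(cds_pos: int,
                   segments: List[Tuple[int, int]],
                   strand: str = "+") -> int:
    """Map a 1-based CDS position to a 1-based genomic coordinate.

    segments: the coding portion of each exon as (start, end) genomic
    coordinates, 1-based inclusive, with start <= end (hypothetical format).
    strand: "+" if the gene is on the forward strand, "-" if on the reverse.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if cds_pos < 1:
        raise ValueError("CDS positions are 1-based")

    # Visit coding segments in transcript order: ascending genomic order on
    # the forward strand, descending on the reverse strand.
    ordered = sorted(segments, reverse=(strand == "-"))

    remaining = cds_pos
    for start, end in ordered:
        length = end - start + 1
        if remaining <= length:
            # The position falls inside this segment.
            if strand == "+":
                return start + remaining - 1
            return end - remaining + 1
        remaining -= length  # skip this segment (and the intron after it)

    raise ValueError("position lies beyond the end of the CDS")


# Toy gene with two coding segments of 10 and 20 bases (made-up coordinates).
toy_segments = [(1000, 1009), (2000, 2019)]
print(cds_to_genomic(11, toy_segments, "+"))  # first base of second segment
print(cds_to_genomic(1, toy_segments, "-"))   # reverse strand starts at the far end
```

A real conversion would also have to handle places where the conventional cDNA sequence differs from the reference genome, which this sketch deliberately leaves out.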

Results: Some major differences and alignment issues were found when attempting to convert the conventional cDNA to human reference genome sequences for the following genes: ABO, A4GALT, RHD, RHCE, FUT3, ACKR1 (previously DARC), ACHE, FUT2, CR1, GCNT2, and RHAG. However, it was possible to create usable alignments, which facilitated the prediction of all RBC and PLT antigens with a known molecular basis from WGS data. Traditional serologic typing for 18 RBC antigens was in agreement with the WGS-based antigen predictions, providing proof of principle for this approach.

Conclusion: Detailed mapping of conventional cDNA-annotated RBC and PLT alleles can enable accurate prediction of RBC and PLT antigens from whole genome sequencing data.

© 2015 The Authors. Transfusion published by Wiley Periodicals, Inc. on behalf of AABB.

Figures

Figure 1
Approach for mapping conventional cDNA reference sequence positions to genomic coordinates. (A) Process developed to convert conventional CDS positions to genomic coordinates with FY as an example. (B) CDS positions referenced to cDNA sequence. (C and D) Genome transcript and genomic coordinates according to the human reference genome. (E) UCSC genomic sequence in which each exon and intron is annotated as a separate sequence entry preceded by the genomic coordinates. The sequence regions are colored: 3′ and 5′ (gray), CDS (green uppercase), and intron (blue lowercase). (F) FY gene conversion between genomic coordinates and cDNA reference sequence.
Figure 2
WGS‐based RBC and PLT gene sequencing. Circos plot [61] of the WGS data, filtered to show only the RBC and PLT genes, with a circular plot of the sequence coverage (100‐bp bins).
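The 100-bp coverage bins mentioned in the caption can be illustrated with a short sketch. How the bins were actually computed is not stated here, so taking the mean depth per bin is an assumption, as are the function name `binned_mean_coverage` and the toy depth values.

```python
from typing import List


def binned_mean_coverage(depths: List[int], bin_size: int = 100) -> List[float]:
    """Return the mean sequencing depth for each consecutive bin.

    depths: per-base read depth along a region (hypothetical input format).
    The final bin may be shorter than bin_size if the region length is not
    a multiple of bin_size.
    """
    if bin_size < 1:
        raise ValueError("bin_size must be at least 1")

    means = []
    for i in range(0, len(depths), bin_size):
        window = depths[i:i + bin_size]
        means.append(sum(window) / len(window))
    return means


# Toy region: 100 bases at depth 30 followed by 50 bases at depth 10.
toy_depths = [30] * 100 + [10] * 50
print(binned_mean_coverage(toy_depths))
```

Binning smooths per-base noise so that a whole gene can be drawn as a single readable track.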

References

    1. Storry JR, Castilho L, Daniels G, et al. International Society of Blood Transfusion Working Party on red cell immunogenetics and blood group terminology: Cancun report (2012). Vox Sang 2014;107:90‐6.
    2. International Society of Blood Transfusion (ISBT). Red cell immunogenetics and blood group terminology [Internet]. Amsterdam: ISBT Central Office; 2015 [cited 2015 Jul 1]. Available from: .
    3. Reid ME, Lomas‐Francis C, Olsson ML. The blood group antigen factsbook. 3rd ed. San Diego: Academic Press; 2013.
    4. Daniels G. Human blood groups. 3rd ed. Oxford: Wiley‐Blackwell; 2013.
    5. Ballif BA, Helias V, Peyrard T, et al. Disruption of SMIM1 causes the Vel‐ blood type. EMBO Mol Med 2013;5:751‐61.
    6. Storry JR, Jöud M, Christophersen MK, et al. Homozygosity for a null allele of SMIM1 defines the Vel‐negative blood group phenotype. Nat Genet 2013;45:537‐41.
    7. Cvejic A, Haer‐Wigman L, Stephens JC, et al. SMIM1 underlies the Vel blood group and influences red blood cell traits. Nat Genet 2013;45:542‐5.
    8. Daniels G, Ballif BA, Helias V, et al. Lack of the nucleoside transporter ENT1 results in the Augustine‐null blood type and ectopic mineralization. Blood 2015;125:3651‐4.
    9. Anliker M, von Zabern I, Höchsmann B, et al. A new blood group antigen is defined by anti‐CD59, detected in a CD59‐deficient patient. Transfusion 2014;54:1817‐22.
    10. Immuno polymorphism database: IPD‐HPA [Internet]. Cambridgeshire: EMBL‐EBI; 2015 [cited 2015 May 15]. Available from: .
    11. Robinson J, Halliwell JA, McWilliam H, et al. IPD—the Immuno Polymorphism Database. Nucleic Acids Res 2013;41:D1234‐40.
    12. Metcalfe P, Watkins NA, Ouwehand WH, et al. Nomenclature of human platelet antigens. Vox Sang 2003;85:240‐5.
    13. Patnaik SK, Helmberg W, Blumenfeld OO. BGMUT: NCBI dbRBC database of allelic variations of genes encoding antigens of blood group systems. Nucleic Acids Res 2012;40:D1023‐9.
    14. Wagner FF. RhesusBase [Internet]. Springe: DRK Blutspendedienst NSTOB; 2015 [cited 2015 Jun 22]. Available from: .
    15. Hashmi G, Shariff T, Zhang Y, et al. Determination of 24 minor red blood cell antigens for more than 2000 blood donors by high‐throughput DNA analysis. Transfusion 2007;47:736‐47.
    16. Liu Z, Liu M, Mercado T, et al. Extended blood group molecular typing and next‐generation sequencing. Transfus Med Rev 2014;28:177‐86.
    17. Chou ST, Westhoff CM. The role of molecular immunohematology in sickle cell disease. Transfus Apher Sci 2011;44:73‐9.
    18. Avent ND, Martinez A, Flegel WA, et al. The Bloodgen Project of the European Union, 2003‐2009. Transfus Med Hemother 2009;36:162‐7.
    19. Storry JR, Olsson ML. The ABO blood group system revisited: a review and update. Immunohematology 2009;25:48‐59.
    20. Svensson L, Rydberg L, de Mattos LC, et al. Blood group A(1) and A(2) revisited: an immunochemical analysis. Vox Sang 2009;96:56‐61.
    21. Liu Y, Fujitani N, Koda Y, et al. Presence of H type 3/4 chains of ABO histo‐blood group system in serous cells of human submandibular gland and regulation of their expression by the secretor gene (FUT2). J Histochem Cytochem 1999;47:889‐94.
    22. Lofling JC, Hauzenberger E, Holgersson J. Absorption of anti‐blood group A antibodies on P‐selectin glycoprotein ligand‐1/immunoglobulin chimeras carrying blood group A determinants: core saccharide chain specificity of the Se and H gene encoded alpha1,2 fucosyltransferases in different host cells. Glycobiology 2002;12:173‐82.
    23. Robinson J, Mistry K, McWilliam H, et al. IPD—the Immuno Polymorphism Database. Nucleic Acids Res 2010;38:D863‐9.
    24. Arinsburg SA, Shaz BH, Westhoff C, et al. Determination of human platelet antigen typing by molecular methods: importance in diagnosis and early treatment of neonatal alloimmune thrombocytopenia. Am J Hematol 2012;87:525‐8.
    25. Lind C, Ferriola D, Mackiewicz K, et al. Next‐generation sequencing: the solution for high‐resolution, unambiguous human leukocyte antigen typing. Hum Immunol 2010;71:1033‐42.
    26. Gabriel C, Furst D, Fae I, et al. HLA typing by next‐generation sequencing—getting closer to reality. Tissue Antigens 2014;83:65‐75.
    27. Shiina T, Suzuki S, Ozaki Y, et al. Super high resolution for single molecule‐sequence‐based typing of classical HLA loci at the 8‐digit level using next generation sequencers. Tissue Antigens 2012;80:305‐16.
    28. Wang C, Krishnakumar S, Wilhelmy J, et al. High‐throughput, high‐fidelity HLA genotyping with deep sequencing. Proc Natl Acad Sci U S A 2012;109:8676‐81.
    29. Erlich RL, Jia X, Anderson S, et al. Next‐generation sequencing for HLA typing of class I loci. BMC Genomics 2011;12:42.
    30. Bentley G, Higuchi R, Hoglund B, et al. High‐resolution, high‐throughput HLA genotyping by next‐generation sequencing. Tissue Antigens 2009;74:393‐403.
    31. Chu HT, Lin H, Tsao TT, et al. Genotyping of human neutrophil antigens (HNA) from whole genome sequencing data. BMC Med Genomics 2013;6:31.
    32. Stabentheiner S, Danzer M, Niklas N, et al. Overcoming methodical limits of standard RHD genotyping by next‐generation sequencing. Vox Sang 2011;100:381‐8.
    33. Rieneck K, Bak M, Jonson L, et al. Next‐generation sequencing: proof of concept for antenatal prediction of the fetal Kell blood group phenotype from cell‐free fetal DNA in maternal plasma. Transfusion 2013;53:2892‐8.
    34. Fichou Y, Audrézet MP, Guéguen P, et al. Next‐generation sequencing is a credible strategy for blood group genotyping. Br J Haematol 2014;167:554‐62.
    35. Giollo M, Minervini G, Scalzotto M, et al. BOOGIE: predicting blood groups from high throughput sequencing data. PLoS One 2015;10:e0124579.
    36. Ball MP, Thakuria JV, Zaranek AW, et al. A public resource facilitating clinical use of genomes. Proc Natl Acad Sci U S A 2012;109:11920‐7.
    37. Church GM. The personal genome project. Mol Syst Biol 2005;1:2005.0030.
    38. Sievers F, Wilm A, Dineen D, et al. Fast, scalable generation of high‐quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol 2011;7:539.
    39. Thorvaldsdottir H, Robinson JT, Mesirov JP. Integrative Genomics Viewer (IGV): high‐performance genomics data visualization and exploration. Brief Bioinform 2013;14:178‐92.
    40. Vassy JL, Lautenbach DM, McLaughlin HM, et al. The MedSeq Project: a randomized trial of integrating whole genome sequencing into clinical medicine. Trials 2014;15:85.
    41. Bentley DR, Balasubramanian S, Swerdlow HP, et al. Accurate whole human genome sequencing using reversible terminator chemistry. Nature 2008;456:53‐9.
    42. Li H, Durbin R. Fast and accurate long‐read alignment with Burrows‐Wheeler transform. Bioinformatics 2010;26:589‐95.
    43. McKenna A, Hanna M, Banks E, et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next‐generation DNA sequencing data. Genome Res 2010;20:1297‐303.
    44. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 2010;26:841‐2.
    45. Thuresson B, Westman JS, Olsson ML. Identification of a novel A4GALT exon reveals the genetic basis of the P1/P2 histo‐blood groups. Blood 2011;117:678‐87.
    46. Wagner FF, Ladewig B, Angert KS, et al. The DAU allele cluster of the RHD gene. Blood 2002;100:306‐11.
    47. Chou ST, Jackson T, Vege S, et al. High prevalence of red blood cell alloimmunization in sickle cell disease despite transfusion from Rh‐matched minority donors. Blood 2013;122:1062‐71.
    48. Westhoff CM, Vege S, Horn T, et al. RHCE*ceMO is frequently in cis to RHD*DAU0 and encodes a hr(S) ‐, hr(B) ‐, RH:‐61 phenotype in black persons: clinical significance. Transfusion 2013;53:2983‐9.
    49. Westhoff CM, Silberstein LE, Wylie DE, et al. 16Cys encoded by the RHce gene is associated with altered expression of the e antigen and is frequent in the R0 haplotype. Br J Haematol 2001;113:666‐71.
    50. Chaudhuri A, Polyakova J, Zbrzezna V, et al. Cloning of glycoprotein D cDNA, which encodes the major subunit of the Duffy blood group system and the receptor for the Plasmodium vivax malaria parasite. Proc Natl Acad Sci U S A 1993;90:10793‐7.
    51. Iwamoto S, Li J, Omi T, et al. Identification of a novel exon and spliced form of Duffy mRNA that is the predominant transcript in both erythroid and postcapillary venule endothelium. Blood 1996;87:378‐85.
    52. Iwamoto S, Omi T, Kajii E, et al. Genomic organization of the glycoprotein D gene: Duffy blood group Fya/Fyb alloantigen system is associated with a polymorphism at the 44‐amino acid residue. Blood 1995;85:622‐6.
    53. Elmgren A. Significance of individual point mutations, T202C and C314T, in the human Lewis (FUT3) gene for expression of Lewis antigens by the human alpha (1,3/1,4)‐fucosyltransferase, Fuc‐TIII. J Biol Chem 1997;272:21994‐8.
    54. Li Y, Camp S, Rachinsky TL, et al. Gene structure of mammalian acetylcholinesterase. Alternative exons dictate tissue‐specific expression. J Biol Chem 1991;266:23083‐90.
    55. Bartels CF, Zelinski T, Lockridge O. Mutation at codon 322 in the human acetylcholinesterase (ACHE) gene accounts for YT blood group polymorphism. Am J Hum Genet 1993;52:928‐36.
    56. Kelly RJ, Rouquier S, Giorgi D, et al. Sequence expression of a candidate for the human Secretor blood group alpha(1,2)fucosyltransferase gene (FUT2). Homozygosity for an enzyme‐inactivating nonsense mutation commonly correlates with the non‐secretor phenotype. J Biol Chem 1995;270:4640‐9.
    57. Moulds JM, Zimmerman PA, Doumbo OK, et al. Expansion of the Knops blood group system and subdivision of Sl(a). Transfusion 2002;42:251‐6.
    58. NHLBI GO Exome Sequencing Project (ESP) Exome Variant Server [Internet]. Seattle (WA): University of Washington; 2015 [cited 2015 Jul 15]. Available from: .
    59. Inaba N, Hiruma T, Togayachi A, et al. A novel I‐branching beta‐1,6‐N‐acetylglucosaminyltransferase involved in human blood group I antigen expression. Blood 2003;101:2870‐6.
    60. Ridgwell K, Spurr NK, Laguda B, et al. Isolation of cDNA clones for a 50 kDa glycoprotein of the human erythrocyte membrane associated with Rh (rhesus) blood‐group antigen expression. Biochem J 1992;287:223‐8.
    61. Krzywinski M, Schein J, Birol I, et al. Circos: an information aesthetic for comparative genomics. Genome Res 2009;19:1639‐45.
    62. Drier Y, Lawrence MS, Carter SL, et al. Somatic rearrangements across cancer reveal classes of samples with distinct patterns of DNA breakage and rearrangement‐induced hypermutability. Genome Res 2013;23:228‐35.

Source: PubMed
