Genomic characterization of a newly discovered coronavirus associated with acute respiratory distress syndrome in humans

Sander van Boheemen, Miranda de Graaf, Chris Lauber, Theo M Bestebroer, V Stalin Raj, Ali Moh Zaki, Albert D M E Osterhaus, Bart L Haagmans, Alexander E Gorbalenya, Eric J Snijder, Ron A M Fouchier

Abstract

A novel human coronavirus (HCoV-EMC/2012) was isolated from a man with acute pneumonia and renal failure in June 2012. This report describes the complete genome sequence, genome organization, and expression strategy of HCoV-EMC/2012 and its relationship to known coronaviruses. The genome comprises 30,119 nucleotides and contains at least 10 predicted open reading frames, 9 of which are predicted to be expressed from a nested set of seven subgenomic mRNAs. Phylogenetic analysis of the replicase gene of coronaviruses with completely sequenced genomes showed that HCoV-EMC/2012 is most closely related to Tylonycteris bat coronavirus HKU4 (BtCoV-HKU4) and Pipistrellus bat coronavirus HKU5 (BtCoV-HKU5), which prototype two species in lineage C of the genus Betacoronavirus. In accordance with the guidelines of the International Committee on Taxonomy of Viruses, and in view of the 75% and 77% amino acid sequence identity in 7 conserved replicase domains with BtCoV-HKU4 and BtCoV-HKU5, respectively, we propose that HCoV-EMC/2012 prototypes a novel species in the genus Betacoronavirus. HCoV-EMC/2012 may be most closely related to a coronavirus detected in Pipistrellus pipistrellus in The Netherlands, but because only a short sequence from the most conserved part of the RNA-dependent RNA polymerase-encoding region of the genome was reported for this bat virus, its genetic distance from HCoV-EMC/2012 remains uncertain. HCoV-EMC/2012 is the sixth coronavirus known to infect humans and the first human virus within betacoronavirus lineage C.

Importance: Coronaviruses are capable of infecting humans and many animal species. Most infections caused by human coronaviruses are relatively mild. However, the outbreak of severe acute respiratory syndrome (SARS) caused by SARS-CoV in 2002 to 2003 and the fatal infection of a human by HCoV-EMC/2012 in 2012 show that coronaviruses are able to cause severe, sometimes fatal disease in humans. We have determined the complete genome of HCoV-EMC/2012 using an unbiased virus discovery approach involving next-generation sequencing techniques, which enabled subsequent state-of-the-art bioinformatics, phylogenetics, and taxonomic analyses. By establishing its complete genome sequence, HCoV-EMC/2012 was characterized as a new genotype which is closely related to bat coronaviruses that are distant from SARS-CoV. We expect that this information will be vital to rapid advancement of both clinical and vital research on this emerging pathogen.
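The amino acid identity percentages quoted in the abstract (75% and 77% across 7 conserved replicase domains) are the kind of figure that can be computed from a pairwise alignment. The following is a minimal sketch only: the function name `percent_identity` and the toy sequences are hypothetical, and the choice to skip gapped columns is an assumption, since the abstract does not state how identity was calculated.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the ungapped columns of two pre-aligned sequences.

    Assumption: columns with a gap in either sequence are ignored. Other
    definitions of identity (different denominators) exist and give
    different numbers.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0   # columns where both sequences have a residue
    identical = 0  # compared columns where the residues match
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            identical += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy aligned fragments, not real replicase sequences.
    print(percent_identity("MKV-LA", "MKIALA"))
```

For a multi-domain comparison such as the one described, one plausible approach would be to concatenate the aligned domains before calling the function, so that each domain contributes in proportion to its length.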

Figures

FIG 1
Genome organization and expression of HCoV-EMC/2012. (A) The coding part of the genome and terminal untranslated regions are depicted, respectively, by a gray background and horizontal lines. Rectangles indicate ORFs and their locations in three reading frames. The dashed lines in ORF1a and ORF5 indicate base ambiguities observed during sequencing. Triangles represent sites in the replicase polyproteins pp1a and pp1ab that are predicted to be cleaved by papain-like proteinases (gray) or the 3C-like cysteine proteinase (black). Cleavage products are numbered nsp1 to nsp16, according to the convention established for other coronaviruses (23). The −1 ribosomal frameshift site (RFS) in the ORF1a/ORF1b overlap region is indicated. The location of the leader TRS (transcription-regulatory sequences) (L) and seven body TRSs (numbered) are highlighted by black dots. All coordinates correspond to the scale shown at the bottom. (B) Sequence comparison of leader TRS region and seven body TRSs. The fully conserved TRS core sequence AACGAA is highlighted. Nucleotides in the body TRSs are written in uppercase letters if the complementary nucleotide can base pair with the corresponding residue in the leader TRS region (including G-U base pairs). TRS starting coordinates in the HCoV-EMC/2012 genome are shown at the left; for the body TRSs, the numbers of (potential) base pairs with the leader TRS region are shown at the right.
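The uppercase/lowercase convention described for panel B can be illustrated with a short sketch. It locates the conserved core AACGAA in a sequence and marks which body TRS residues could pair with the leader TRS region once copied into the complementary strand, counting G-U pairs as allowed. The function names and the toy sequences are hypothetical and are not taken from the HCoV-EMC/2012 genome; positions returned are 0-based, whereas genome coordinates in a figure would normally be 1-based.

```python
CORE = "AACGAA"  # conserved TRS core sequence named in the legend

COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A"}
# Allowed pairs: Watson-Crick plus G-U wobble, in either orientation.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def normalize(seq: str) -> str:
    """Uppercase the sequence and write it as RNA (T becomes U)."""
    return seq.upper().replace("T", "U")


def find_core(genome: str, core: str = CORE) -> list[int]:
    """Return 0-based start positions of every occurrence of the core."""
    genome, core = normalize(genome), normalize(core)
    positions = []
    start = genome.find(core)
    while start != -1:
        positions.append(start)
        start = genome.find(core, start + 1)
    return positions


def mark_pairing(leader: str, body: str) -> tuple[str, int]:
    """Uppercase body residues whose complement can pair with the leader.

    Both sequences are given in genome (plus-strand) sense and must already
    be aligned position by position. Returns the marked body sequence and
    the number of potential base pairs.
    """
    leader, body = normalize(leader), normalize(body)
    if len(leader) != len(body):
        raise ValueError("leader and body must have equal length")

    marked = []
    pairs = 0
    for lead_nt, body_nt in zip(leader, body):
        complement_nt = COMPLEMENT[body_nt]  # residue in the complementary strand
        if (complement_nt, lead_nt) in PAIRS:
            marked.append(body_nt)
            pairs += 1
        else:
            marked.append(body_nt.lower())
    return "".join(marked), pairs


if __name__ == "__main__":
    # Toy sequences for illustration only.
    print(find_core("GGAACGAAUUAACGAAC"))
    print(mark_pairing("CUAACGAAUUU", "AUAACGAACGG"))
```

In the toy example, the body residue C at the ninth position is marked as pairing because its complement G can form a wobble pair with the leader U, which is the case the legend singles out with "including G-U base pairs".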
FIG 2
Phylogenetic trees for HCoV-EMC/2012 and selected other coronaviruses. Unrooted maximum likelihood phylogenies inferred from the nucleotide sequences of full-length ORF1ab (A) or a 332-nt fragment from the RdRp-encoding domain of ORF1b (B) are shown. HCoV-EMC/2012 and 20 viruses representing the recognized species diversity of coronaviruses were included, with bat-derived isolate VM314/2008 also included in the analysis presented in panel B (31). The viruses and corresponding species used are Alphacoronavirus 1 (Alpha-CoV1), Human coronavirus 229E (HCoV-229E), Human coronavirus NL63 (HCoV-NL63), Miniopterus bat coronavirus 1 (BtCoV-1AB), Miniopterus bat coronavirus HKU8 (BtCoV-HKU8), Porcine epidemic diarrhea virus (PED), Rhinolophus bat coronavirus HKU2 (BtCoV-HKU2), Scotophilus bat coronavirus 512 (BtCoV-512), Betacoronavirus 1 (Beta-CoV1), Human coronavirus HKU1 (HCoV-HKU1), Murine coronavirus (MHV), Tylonycteris bat coronavirus HKU4 (BtCoV-HKU4), Pipistrellus bat coronavirus HKU5 (BtCoV-HKU5), Rousettus bat coronavirus HKU9 (BtCoV-HKU9), Severe acute respiratory syndrome-related coronavirus (SARS-CoV), Avian coronavirus (IBV), Beluga whale coronavirus SW1 (BWCoV-SW1), Bulbul coronavirus HKU11 (ACoV-HKU11), Thrush coronavirus HKU12 (ACoV-HKU12), and Munia coronavirus HKU13 (ACoV-HKU13). Bootstrap values above 50 are shown. Arcs and symbols indicate the four coronavirus genera. The scale bar represents the number of nucleotide substitutions per site.
FIG 3
Phylogenetic trees for HCoV-EMC/2012 and selected other coronaviruses. Unrooted maximum likelihood phylogenies based on coronavirus-wide conserved protein domains in replicase pp1ab (A) or on the conserved parts of structural proteins S2, E, M, and N (B) for HCoV-EMC/2012 and 20 viruses representing the recognized species diversity of coronaviruses are shown (see Fig. 2 legend for names and abbreviations). Branch support values are based on the Shimodaira-Hasegawa-like procedure and are in the range of zero to one; only nonoptimal values smaller than one are shown. Arcs and symbols indicate the four coronavirus genera. The scale bars represent average numbers of substitutions per amino acid position.

References

    1. de Groot RJ, et al. 2012. Family Coronaviridae, p. 806–828 In King AMQ, Adams MJ, Carstens EB, Lefkowitz EJ, Virus taxonomy, the 9th report of the international committee on taxonomy of viruses. Academic Press, San Diego, CA.
    2. Perlman S, Netland J. 2009. Coronaviruses post-SARS: update on replication and pathogenesis. Nat. Rev. Microbiol. 7:439–450.
    3. Gloza-Rausch F, et al. 2008. Detection and prevalence patterns of group I coronaviruses in bats, northern Germany. Emerg. Infect. Dis. 14:626–631.
    4. Lau SK, et al. 2005. Severe acute respiratory syndrome coronavirus-like virus in Chinese horseshoe bats. Proc. Natl. Acad. Sci. U. S. A. 102:14040–14045.
    5. Li W, et al. 2005. Bats are natural reservoirs of SARS-like coronaviruses. Science 310:676–679.
    6. Pfefferle S, et al. 2009. Distant relatives of severe acute respiratory syndrome coronavirus and close relatives of human coronavirus 229E in bats, Ghana. Emerg. Infect. Dis. 15:1377–1384.
    7. Vijaykrishna D, et al. 2007. Evolutionary insights into the ecology of coronaviruses. J. Virol. 81:4012–4020.
    8. Woo PC, et al. 2007. Comparative analysis of twelve genomes of three novel group 2c and group 2d coronaviruses reveals unique group and subgroup features. J. Virol. 81:1574–1585.
    9. Hamre D, Procknow JJ. 1966. A new virus isolated from the human respiratory tract. Proc. Soc. Exp. Biol. Med. 121:190–193.
    10. McIntosh K, Dees JH, Becker WB, Kapikian AZ, Chanock RM. 1967. Recovery in tracheal organ cultures of novel viruses from patients with respiratory disease. Proc. Natl. Acad. Sci. U. S. A. 57:933–940.
    11. Drosten C, et al. 2003. Identification of a novel coronavirus in patients with severe acute respiratory syndrome. N. Engl. J. Med. 348:1967–1976.
    12. Marra MA, et al. 2003. The genome sequence of the SARS-associated coronavirus. Science 300:1399–1404.
    13. Peiris JS, Guan Y, Yuen KY. 2004. Severe acute respiratory syndrome. Nat. Med. 10:S88–S97.
    14. Rota PA, et al. 2003. Characterization of a novel coronavirus associated with severe acute respiratory syndrome. Science 300:1394–1399.
    15. Fouchier RA, et al. 2004. A previously undescribed coronavirus associated with respiratory disease in humans. Proc. Natl. Acad. Sci. U. S. A. 101:6212–6216.
    16. van der Hoek L, et al. 2004. Identification of a new human coronavirus. Nat. Med. 10:368–373.
    17. Woo PC, et al. 2005. Characterization and complete genome sequence of a novel coronavirus, coronavirus HKU1, from patients with pneumonia. J. Virol. 79:884–895.
    18. Zlateva KT, et al. 2012. No novel coronaviruses identified in a large collection of human nasopharyngeal specimens using family-wide CODEHOP-based primers. Arch. Virol., in press.
    19. Gorbalenya AE, Enjuanes L, Ziebuhr J, Snijder EJ. 2006. Nidovirales: evolving the largest RNA virus genome. Virus Res. 117:17–37.
    20. Adams MJ, Carstens EB. 2012. Ratification vote on taxonomic proposals to the International Committee on Taxonomy of Viruses (2012). Arch. Virol. 157:1411–1422.
    21. Lauber C, Gorbalenya AE. 2012. Partitioning the genetic diversity of a virus family: approach and evaluation through a case study of picornaviruses. J. Virol. 86:3890–3904.
    22. Masters PS. 2006. The molecular biology of coronaviruses. Adv. Virus Res. 66:193–292.
    23. Snijder EJ, et al. 2003. Unique and conserved features of genome and proteome of SARS-coronavirus, an early split-off from the coronavirus group 2 lineage. J. Mol. Biol. 331:991–1004.
    24. Zaki AM, et al. 2012. Isolation of a novel coronavirus from a man with pneumonia in Saudi Arabia. N. Engl. J. Med., in press.
    25. Bermingham A, et al. 2012. Severe respiratory illness caused by a novel coronavirus, in a patient transferred to the United Kingdom from the Middle East, September 2012. Euro Surveill. 17:pii=10290.
    26. Ziebuhr J, Snijder EJ, Gorbalenya AE. 2000. Virus-encoded proteinases and proteolytic processing in the nidovirales. J. Gen. Virol. 81:853–879.
    27. Pasternak AO, Spaan WJ, Snijder EJ. 2006. Nidovirus transcription: how to make sense…? J. Gen. Virol. 87:1403–1421.
    28. Sawicki SG, Sawicki DL, Siddell SG. 2007. A contemporary view of coronavirus transcription. J. Virol. 81:20–29.
    29. Sola I, Mateos-Gomez PA, Almazan F, Zuñiga S, Enjuanes L. 2011. RNA-RNA and RNA-protein interactions in coronavirus replication and transcription. RNA Biol. 8:237–248.
    30. Firth AE, Brierley I. 2012. Non-canonical translation in RNA viruses. J. Gen. Virol. 93:1385–1409.
    31. Reusken CB, et al. 2010. Circulation of group 2 coronaviruses in a bat species common to urban areas in Western Europe. Vector Borne Zoonotic Dis. 10:785–791.
    32. Holmes KV, Lai MC. 1996. Coronaviridae: the viruses and their replication, p. 1075–1093 In Fields BN, Knipe DM, Howley PM, Fields virology, 3rd ed. Lippincott-Raven, Philadelphia, PA.
    33. McIntosh K. 1996. Coronaviruses, p. 1095–1103 In Fields BN, Knipe DM, Howley PM, Fields virology, 3rd ed. Lippincott-Raven, Philadelphia, PA.
    34. Keng CT, et al. 2011. SARS coronavirus 8b reduces viral replication by down-regulating E via an ubiquitin-independent proteasome pathway. Microbes Infect. 13:179–188.
    35. Narayanan K, Huang C, Makino S. 2008. Coronavirus accessory proteins, p. 235–244 In Perlman S, Gallagher T, Snijder EJ, Nidoviruses. ASM Press, Washington, DC.
    36. Senanayake SD, Brian DA. 1997. Bovine coronavirus I protein synthesis follows ribosomal scanning on the bicistronic N mRNA. Virus Res. 48:101–105.
    37. Mazumder R, Iyer LM, Vasudevan S, Aravind L. 2002. Detection of novel members, structure-function analysis and evolutionary classification of the 2H phosphoesterase superfamily. Nucleic Acids Res. 30:5229–5243.
    38. Snijder EJ, den Boon JA, Horzinek MC, Spaan WJ. 1991. Comparison of the genome organization of toro- and coronaviruses: evidence for two nonhomologous RNA recombination events during Berne virus evolution. Virology 180:448–452.
    39. Yount B, et al. 2005. Severe acute respiratory syndrome coronavirus group-specific open reading frames encode nonessential functions for replication in cell cultures and mice. J. Virol. 79:14909–14922.
    40. Cruz JL, et al. 2011. Coronavirus gene 7 counteracts host defenses and modulates virus virulence. PLoS Pathog. 7:e1002090.
    41. de Haan CA, Masters PS, Shen X, Weiss S, Rottier PJ. 2002. The group-specific murine coronavirus genes are not essential, but their deletion, by reverse genetics, is attenuating in the natural host. Virology 296:177–189.
    42. Pewe L, et al. 2005. A severe acute respiratory syndrome-associated coronavirus-specific protein enhances virulence of an attenuated murine coronavirus. J. Virol. 79:11335–11342.
    43. Frieman M, et al. 2007. Severe acute respiratory syndrome coronavirus ORF6 antagonizes STAT1 function by sequestering nuclear import factors on the rough endoplasmic reticulum/Golgi membrane. J. Virol. 81:9812–9824.
    44. Oostra M, de Haan CA, Rottier PJ. 2007. The 29-nucleotide deletion present in human but not in animal severe acute respiratory syndrome coronaviruses disrupts the functional expression of open reading frame 8. J. Virol. 81:13876–13888.
    45. Corman VM, et al. 2012. Detection of a novel human coronavirus by real-time reverse-transcription polymerase chain reaction. Euro Surveill. 17:pii=20285.
    46. Welsh J, McClelland M. 1990. Fingerprinting genomes using PCR with arbitrary primers. Nucleic Acids Res. 18:7213–7218.
    47. Hall TA. 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp. Ser. 41:95–98.
    48. Gorbalenya AE, et al. 2010. Practical application of bioinformatics by the multidisciplinary VIZIER consortium. Antiviral Res. 87:95–110.
    49. Castresana J. 2000. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol. Biol. Evol. 17:540–552.
    50. Antonov IV, Leontovich AM, Gorbalenya AE. 2008. BAGG—Blocks accepting gaps generator, version 1.0.
    51. Darriba D, Taboada GL, Doallo R, Posada D. 2011. ProtTest 3: fast selection of best-fit models of protein evolution. Bioinformatics 27:1164–1165.
    52. Guindon S, et al. 2010. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59:307–321.
    53. Katoh K, Misawa K, Kuma K, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30:3059–3066.
    54. Posada D, Crandall KA. 1998. ModelTest: testing the model of DNA substitution. Bioinformatics 14:817–818.

Source: PubMed
