Probable Pangolin Origin of SARS-CoV-2 Associated with the COVID-19 Outbreak

Tao Zhang, Qunfu Wu, Zhigang Zhang

Abstract

An outbreak of coronavirus disease 2019 (COVID-19), caused by the 2019 novel coronavirus (SARS-CoV-2), began in the city of Wuhan, China, and has spread worldwide. Identifying potential intermediate hosts of SARS-CoV-2 is vital to controlling the spread of COVID-19. We therefore reinvestigated published data from pangolin lung samples in which Liu et al. [1] detected SARS-CoV-like CoVs. We found genomic and evolutionary evidence of a SARS-CoV-2-like CoV (named Pangolin-CoV) in dead Malayan pangolins. At the whole-genome level, Pangolin-CoV is 91.02% identical to SARS-CoV-2 and 90.55% identical to BatCoV RaTG13. Aside from RaTG13, Pangolin-CoV is the CoV most closely related to SARS-CoV-2. The S1 protein of Pangolin-CoV is much more closely related to that of SARS-CoV-2 than to that of RaTG13. Five key amino acid residues involved in the interaction with human ACE2 are identical between Pangolin-CoV and SARS-CoV-2, whereas four amino acid mutations are present in RaTG13. Both Pangolin-CoV and RaTG13 have lost the putative furin recognition sequence motif at the S1/S2 cleavage site that is observed in SARS-CoV-2. In conclusion, this study suggests that pangolin species are a natural reservoir of SARS-CoV-2-like CoVs.

Keywords: COVID-19; SARS-CoV-2; origin; pangolin.

Conflict of interest statement

The authors declare no competing interests.

Copyright © 2020 Elsevier Inc. All rights reserved.

Figures

Graphical abstract
Figure 1. Genome-Related Analysis. (A) Sequence depth of reads remapped to Pangolin-CoV. (B) Similarity plot based on the full-length genome sequence of Pangolin-CoV. Full-length genome sequences of SARS-CoV-2 (Beta-CoV/Wuhan-Hu-1), BatCoV RaTG13, bat SARSr-CoV 21, bat SARSr-CoV45, bat SARSr-CoV WIV1, and SARS-CoV BJ01 were used as reference sequences. (C) Comparison of common genome organization similarity among SARS-CoV-2, Pangolin-CoV, and BatCoV RaTG13. Related to Table S2.
Figure 2. Phylogenetic Relationship of CoVs Based on the Whole Genome and RdRp Gene Nucleotide Sequences. Red text denotes the Malayan Pangolin-CoV. Pink text denotes SARS-CoV-2. Green text denotes a bat CoV with 96% similarity at the genome level to SARS-CoV-2. Blue text denotes the reference CoVs used in Figure 1B. Detailed information can be found in the STAR Methods. Related to Figures S1–S3.
Figure 3. Amino Acid Sequence Alignment of the S1 Protein and Its Phylogeny. The receptor-binding motif of SARS-CoV and the homologous region of other CoVs are indicated by the gray box. The key amino acid residues involved in the interaction with human ACE2 are marked with the orange box. Bat SARS-CoV-like CoVs have been reported not to use ACE2 and have amino acid deletions at two motifs marked by the yellow box. Detailed information can be found in the STAR Methods.
Figure 4. CoV S Protein S1/S2 Cleavage Sites. Four amino acid insertions (SPRRs) unique to SARS-CoV-2 are marked in yellow. Conserved S1/S2 cleavage sites are marked in green.

References

    1. Liu P., Chen W., Chen J.-P. Viral metagenomics revealed sendai virus and coronavirus infection of Malayan Pangolins (Manis javanica). Viruses. 2019;11:979.
    2. Li W., Shi Z., Yu M., Ren W., Smith C., Epstein J.H., Wang H., Crameri G., Hu Z., Zhang H. Bats are natural reservoirs of SARS-like coronaviruses. Science. 2005;310:676–679.
    3. Zhou P., Yang X.-L., Wang X.-G., Hu B., Zhang L., Zhang W., Si H.-R., Zhu Y., Li B., Huang C.-L. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature. 2020 doi: 10.1038/s41586-020-2012-7. Published online February 3, 2020.
    4. Cui J., Li F., Shi Z.-L. Origin and evolution of pathogenic coronaviruses. Nat. Rev. Microbiol. 2019;17:181–192.
    5. Huang C., Wang Y., Li X., Ren L., Zhao J., Hu Y., Zhang L., Fan G., Xu J., Gu X. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet. 2020;395:497–506.
    6. Wu F., Zhao S., Yu B., Chen Y.-M., Wang W., Song Z.-G., Hu Y., Tao Z.-W., Tian J.-H., Pei Y.-Y. A new coronavirus associated with human respiratory disease in China. Nature. 2020 doi: 10.1038/s41586-020-2008-3. Published online February 3, 2020.
    7. Abecasis G.R., Altshuler D., Auton A., Brooks L.D., Durbin R.M., Gibbs R.A., Hurles M.E., McVean G.A., 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature. 2010;467:1061–1073.
    8. Albertsen M., Hugenholtz P., Skarshewski A., Nielsen K.L., Tyson G.W., Nielsen P.H. Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of multiple metagenomes. Nat. Biotechnol. 2013;31:533–538.
    9. Lole K.S., Bollinger R.C., Paranjape R.S., Gadkari D., Kulkarni S.S., Novak N.G., Ingersoll R., Sheppard H.W., Ray S.C. Full-length human immunodeficiency virus type 1 genomes from subtype C-infected seroconverters in India, with evidence of intersubtype recombination. J. Virol. 1999;73:152–160.
    10. Xiao K., Zhai J., Feng Y., Zhou N., Zhang X., Zou J.-J., Li N., Guo Y., Li X., Shen X. Isolation and characterization of 2019-nCoV-like coronavirus from Malayan pangolins. bioRxiv. 2020 doi: 10.1101/2020.02.17.951335.
    11. Lam T.T.-Y., Shum M.H.-H., Zhu H.-C., Tong Y.-G., Ni X.-B., Liao Y.-S., Wei W., Cheung W.Y.-M., Li W.-J., Li L.-F. Identification of 2019-nCoV related coronaviruses in Malayan pangolins in southern China. bioRxiv. 2020 doi: 10.1101/2020.02.13.945485.
    12. Tortorici M.A., Veesler D. Structural insights into coronavirus entry. In: Rey F.A., editor. Advances in Virus Research. Academic Press; 2019. pp. 93–116.
    13. Ge X.-Y., Li J.-L., Yang X.-L., Chmura A.A., Zhu G., Epstein J.H., Mazet J.K., Hu B., Zhang W., Peng C. Isolation and characterization of a bat SARS-like coronavirus that uses the ACE2 receptor. Nature. 2013;503:535–538.
    14. Wong S.K., Li W., Moore M.J., Choe H., Farzan M. A 193-amino acid fragment of the SARS coronavirus S protein efficiently binds angiotensin-converting enzyme 2. J. Biol. Chem. 2004;279:3197–3201.
    15. Yan R., Zhang Y., Li Y., Xia L., Guo Y., Zhou Q. Structural basis for the recognition of the SARS-CoV-2 by full-length human ACE2. Science. 2020 doi: 10.1126/science.abb2762. Published online March 4, 2020.
    16. Millet J.K., Whittaker G.R. Host cell entry of Middle East respiratory syndrome coronavirus after two-step, furin-mediated activation of the spike protein. Proc. Natl. Acad. Sci. USA. 2014;111:15214–15219.
    17. Coutard B., Valle C., de Lamballerie X., Canard B., Seidah N.G., Decroly E. The spike glycoprotein of the new coronavirus 2019-nCoV contains a furin-like cleavage site absent in CoV of the same clade. Antiviral Res. 2020;176:104742.
    18. Hoffmann M., Kleine-Weber H., Schroeder S., Krüger N., Herrler T., Erichsen S., Schiergens T.S., Herrler G., Wu N.-H., Nitsche A. SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor. Cell. 2020 doi: 10.1016/j.cell.2020.02.052. Published online March 4, 2020.
    19. Millet J.K., Whittaker G.R. Host cell proteases: critical determinants of coronavirus tropism and pathogenesis. Virus Res. 2015;202:120–134.
    20. Bolger A.M., Lohse M., Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–2120.
    21. Langmead B., Salzberg S.L. Fast gapped-read alignment with Bowtie 2. Nat. Methods. 2012;9:357–359.
    22. Li D., Liu C.-M., Luo R., Sadakane K., Lam T.-W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics. 2015;31:1674–1676.
    23. Camacho C., Coulouris G., Avagyan V., Ma N., Papadopoulos J., Bealer K., Madden T.L. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10:421.
    24. Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R., 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–2079.
    25. Edgar R.C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32:1792–1797.
    26. Talavera G., Castresana J. Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Syst. Biol. 2007;56:564–577.
    27. Kumar S., Stecher G., Li M., Knyaz C., Tamura K. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol. Biol. Evol. 2018;35:1547–1549.

Source: PubMed
