Structure, Function, and Evolution of Coronavirus Spike Proteins

Fang Li

Abstract

The coronavirus spike protein is a multifunctional molecular machine that mediates coronavirus entry into host cells. It first binds to a receptor on the host cell surface through its S1 subunit and then fuses viral and host membranes through its S2 subunit. Two domains in S1 from different coronaviruses recognize a variety of host receptors, leading to viral attachment. The spike protein exists in two structurally distinct conformations, prefusion and postfusion. The transition from prefusion to postfusion conformation of the spike protein must be triggered, leading to membrane fusion. This article reviews current knowledge about the structures and functions of coronavirus spike proteins, illustrating how the two S1 domains recognize different receptors and how the spike proteins are regulated to undergo conformational transitions. I further discuss the evolution of these two critical functions of coronavirus spike proteins, receptor recognition and membrane fusion, in the context of the corresponding functions from other viruses and host cells.

Keywords: coronavirus spike protein; membrane fusion; postfusion conformation; prefusion conformation; receptor binding; virus evolution; virus origin.

Figures

Figure 1
Introduction to coronaviruses and their spike proteins. (a) Classification of coronaviruses. Representative coronaviruses in each genus are human coronavirus NL63 (HCoV-NL63), porcine transmissible gastroenteritis coronavirus (TGEV), porcine epidemic diarrhea coronavirus (PEDV), and porcine respiratory coronavirus (PRCV) in the genus Alphacoronavirus; severe acute respiratory syndrome coronavirus (SARS-CoV), Middle East respiratory syndrome coronavirus (MERS-CoV), bat coronavirus HKU4, mouse hepatitis coronavirus (MHV), bovine coronavirus (BCoV), and human coronavirus OC43 in the genus Betacoronavirus; avian infectious bronchitis coronavirus (IBV) in the genus Gammacoronavirus; and porcine deltacoronavirus (PdCV) in the genus Deltacoronavirus. (b) Schematic of the overall structure of prefusion coronavirus spikes. Shown are the receptor-binding subunit S1, the membrane-fusion subunit S2, the transmembrane anchor (TM), the intracellular tail (IC), and the viral envelope. (c) Schematic of the domain structure of coronavirus spikes, including the S1 N-terminal domain (S1-NTD), the S1 C-terminal domain (S1-CTD), the fusion peptide (FP), and heptad repeat regions N and C (HR-N and HR-C). Scissors indicate two proteolysis sites in coronavirus spikes. (d) Summary of the structures and functions of coronavirus spikes. Host receptors recognized by either of the S1 domains are angiotensin-converting enzyme 2 (ACE2), aminopeptidase N (APN), dipeptidyl peptidase 4 (DPP4), carcinoembryonic antigen-related cell adhesion molecule 1 (CEACAM1), and sugar. The available crystal structures of S1 domains and S2 HRs are shown. Their PDB IDs are 3KBH for HCoV-NL63 S1-CTD, 4F5C for PRCV S1-CTD, 2AJF for SARS-CoV S1-CTD, 4KR0 for MERS-CoV S1-CTD, 3R4D for MHV S1-NTD, 4H14 for BCoV S1-NTD, 2IEQ for HCoV-NL63 HRs, 1WYY for SARS-CoV HRs, 4NJL for MERS-CoV HRs, and 1WDF for MHV HRs.
Figure 2
Cryo-electron microscopy structures of prefusion trimeric coronavirus spikes. (a) Trimeric mouse hepatitis coronavirus (MHV) spike (PDB ID: 3JCL) (16). Three monomers are shown (magenta, cyan, and green). (b) One monomer from the trimeric MHV spike. The important functional elements of the spike [the S1 N-terminal domain (S1-NTD), the S1 C-terminal domain (S1-CTD), the fusion peptide (FP), and heptad repeat region N (HR-N)] are colored in the same way as in Figure 1c. The dotted curve indicates a disordered loop. Scissors indicate two critical proteolysis sites.
Figure 3
Crystal structures of betacoronavirus S1 C-terminal domains (S1-CTDs). (a) Structure of severe acute respiratory syndrome coronavirus (SARS-CoV) S1-CTD complexed with human ACE2 (PDB ID: 2AJF) (52). Shown are the core structure of S1-CTD (cyan), the receptor-binding motif (red), and ACE2 (green). (b) Interface between human SARS-CoV S1-CTD and human ACE2, showing two virus-binding hot spots on human ACE2. Dashed lines indicate salt bridges. (c) Interface between palm civet SARS-CoV S1-CTD and human ACE2. Critical residue changes from human to civet SARS-CoV strains are labeled. (d) Interface between human SARS-CoV S1-CTD and rat or mouse ACE2. Critical residue changes from human to rat or mouse ACE2 are labeled. (e) Structure of Middle East respiratory syndrome coronavirus (MERS-CoV) S1-CTD complexed with human DPP4 (PDB ID: 4KR0) (69).
Figure 4
Crystal structures of alphacoronavirus S1 C-terminal domains (S1-CTDs). (a) Structure of human coronavirus NL63 (HCoV-NL63) S1-CTD complexed with human ACE2 (PDB ID: 3KBH) (83). (b) Structure of porcine respiratory coronavirus (PRCV) S1-CTD complexed with porcine APN (PDB ID: 4F5C) (84). (c) Structural topology of alphacoronavirus S1-CTDs. β-Strands are shown as arrows. (d) Structural topology of betacoronavirus S1-CTDs. α-Helices are shown as cylinders. All of the secondary structural elements in panels c and d are connected in the same order, even though strands β4, β1, and β2 in panel c become helices α4, α1, and a loop, respectively, in panel d. (e) Interface between HCoV-NL63 S1-CTD and human ACE2. Virus-binding motifs (VBMs) on ACE2 are shown in blue. Receptor-binding motifs (RBMs) on S1-CTD are shown in red. (f) Interface between SARS-CoV S1-CTD and human ACE2.
Figure 5
Crystal structures of betacoronavirus S1 N-terminal domains (S1-NTDs). (a) Structure of mouse hepatitis coronavirus (MHV) S1-NTD complexed with murine CEACAM1 (PDB ID: 3R4D) (88). The core structure of MHV S1-NTD is shown in magenta and green, the receptor-binding motifs in red, and the rest in cyan. The N-terminal immunoglobulin domain of CEACAM1 is shown in yellow and virus-binding motifs in blue. (b) Structure of bovine coronavirus (BCoV) S1-NTD (PDB ID: 4H14) (43). The asterisk indicates the binding site for sugar receptor Neu5,9Ac2. (c) Structure of human galectin-3 complexed with galactose (PDB ID: 1A3K). Sugar receptor is shown in blue. (d) Structure of influenza virus HA1 (PDB ID: 1JSO). Sugar receptor is shown in blue. (e–g) Structural topologies of (e) betacoronavirus S1-NTDs, (f) human galectins, and (g) influenza virus HA1.
Figure 6
Structural mechanism for membrane fusion by coronavirus spikes. (a) Structural mechanism for membrane fusion by class I viral membrane fusion proteins. Schematics of these proteins in both prefusion and postfusion conformations are shown. (b) Negative-stain electron microscopy images of SARS-CoV spike in both prefusion and postfusion conformations are shown (18). (c) Schematics of SARS-CoV postfusion S2 in solution (left) and in vivo (right). Abbreviations: FP, fusion peptide; HR-N, heptad repeat region N; HR-C, heptad repeat region C; IC, intracellular tail; SARS-CoV, severe acute respiratory syndrome coronavirus; TM, transmembrane anchor.
Figure 7
Triggers for coronavirus spikes to fuse membranes. Scissors indicate potential spike-processing host proteases. Shown are virus particles (green spheres), virus surface spikes (blue protrusions), viral genome (magenta coils), cells (large gray circles), endosome/lysosome (small shaded gray circle), and receptor (light green base on cell surface). Spike-processing host proteases are labeled for representative coronaviruses: Middle East respiratory syndrome coronavirus (MERS-CoV), mouse hepatitis coronavirus (MHV), and severe acute respiratory syndrome coronavirus (SARS-CoV).
Figure 8
Evolution of coronavirus spikes. (a) Structural comparison between human galectins and alphacoronavirus HCoV-NL63 S1-CTD. Both the crystal structures and structural topologies of the two proteins are shown. Common subcore structures in the two proteins are highlighted with gray shading. (b) Hypothesized evolution of coronavirus spike proteins. Abbreviations: HCoV-NL63, human coronavirus NL63; IC, intracellular tail; S1-CTD, S1 C-terminal domain; S1-NTD, S1 N-terminal domain; TM, transmembrane anchor.

Source: PubMed
