SARS-CoV-2 outbreak in a tri-national urban area is dominated by a B.1 lineage variant linked to a mass gathering event

Madlen Stange, Alfredo Mari, Tim Roloff, Helena Mb Seth-Smith, Michael Schweitzer, Myrta Brunner, Karoline Leuzinger, Kirstine K Søgaard, Alexander Gensch, Sarah Tschudin-Sutter, Simon Fuchs, Julia Bielicki, Hans Pargger, Martin Siegemund, Christian H Nickel, Roland Bingisser, Michael Osthoff, Stefano Bassetti, Rita Schneider-Sliwa, Manuel Battegay, Hans H Hirsch, Adrian Egli

Abstract

The first case of SARS-CoV-2 in Basel, Switzerland was detected on February 26th 2020. We present a phylogenetic study to explore viral introduction and evolution during the exponential early phase of the local COVID-19 outbreak from February 26th until March 23rd. We sequenced SARS-CoV-2 naso-oropharyngeal swabs from 746 positive tests that were performed at the University Hospital Basel during the study period. We successfully generated 468 high-quality genomes from unique patients, called variants with our COVID-19 Pipeline (COVGAP), and analysed viral genetic diversity using PANGOLIN taxonomic lineages. To identify introduction and dissemination events we incorporated global SARS-CoV-2 genomes and inferred a time-calibrated phylogeny. Epidemiological data from patient questionnaires were used to facilitate the interpretation of phylogenetic observations. The early outbreak in Basel was dominated by lineage B.1 (83·6%), detected first on March 2nd, although the first sample identified belonged to B.1.1. Within B.1, 68·2% of our samples fall within a clade defined by the SNP C15324T ('Basel cluster'), including 157 identical sequences at the root of the 'Basel cluster', some of which we can specifically trace to regional spreading events. We infer the origin of B.1-C15324T to mid-February in our tri-national region. The other genomes map broadly over the global phylogenetic tree, showing several introduction events from and/or dissemination to other regions of the world via travellers. Family transmissions can also be traced in our data. A single lineage variant dominated the outbreak in the Basel area while other lineages, such as the first (B.1.1), did not propagate. A mass gathering event was the predominant initial source of cases, with travel returners and family transmissions contributing to a lesser extent.
We highlight the importance of adding specific questions to epidemiological questionnaires, to obtain data on attendance of large gatherings and their locations, as well as travel history, to effectively identify routes of transmission in upcoming outbreaks. This phylogenetic analysis, in concert with epidemiological and contact tracing data, allows connection and interpretation of events, and can inform public health interventions. Trial Registration: ClinicalTrials.gov NCT04351503.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Epidemiological curve of the first COVID-19 wave in Basel-City and region, Switzerland.
Positive (red line, dark grey area) and negative (grey line, light grey area) SARS-CoV-2 PCR tests are depicted, from the beginning of the outbreak in February to March 23rd 2020. Major events and imposed restrictions are marked by horizontal lines. The first confirmed cases in Switzerland and Basel were on February 25th and February 26th, respectively.
Fig 2. SARS-CoV-2 lineage diversity in Switzerland and neighbouring countries during the study period from the first detected case on February 26th to March 23rd 2020, divided into four weeks (I: February 26th–March 3rd, II: March 4th–10th, III: March 11th–17th, IV: March 18th–23rd (6 days)).
A. Daily cases and lineage identity in the Basel area cohort. Low-abundance lineages increased one week after the end of winter school vacation (March 8th) and were introduced by travel returners. B. SARS-CoV-2 lineage diversity in the Basel area, the rest of Switzerland, and neighbouring countries. The number of genomes per week is shown in the inner margin of the time wheel. The number of lineages (L) per country is counted from the onset of the epidemic in each country until March 23rd 2020. Link to the baselayer: https://ec.europa.eu/eurostat/web/gisco/geodata/reference-data/administrative-units-statistical-units/nuts.
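The four-week division used in Fig 2 can be sketched as a simple date lookup. This is a minimal illustration only, not the authors' code: the names `STUDY_WEEKS` and `study_week` are hypothetical, and the boundaries are copied from the caption with the year 2020 assumed throughout.

```python
from datetime import date

# Week boundaries (inclusive) as given in the Fig 2 caption; year 2020 assumed.
STUDY_WEEKS = {
    "I": (date(2020, 2, 26), date(2020, 3, 3)),
    "II": (date(2020, 3, 4), date(2020, 3, 10)),
    "III": (date(2020, 3, 11), date(2020, 3, 17)),
    "IV": (date(2020, 3, 18), date(2020, 3, 23)),  # 6 days only
}


def study_week(sample_date):
    """Return the study-week label for a sampling date, or None if outside the study period."""
    for label, (start, end) in STUDY_WEEKS.items():
        if start <= sample_date <= end:
            return label
    return None


def week_length_days(label):
    """Number of calendar days covered by a study week (both endpoints included)."""
    start, end = STUDY_WEEKS[label]
    return (end - start).days + 1
```

For example, `study_week(date(2020, 3, 2))` falls in week I, the week in which lineage B.1 was first detected according to the abstract.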
Fig 3. SARS-CoV-2 phylogeny of Basel area samples and genetic lineages (PANGOLIN) in a global context.
A. Time tree of SARS-CoV-2 genomes from the Basel area cohort as well as subsampled global genomes (30 genomes per country and month), coloured by continent of origin. Amino acid mutations at internal nodes representing clade defining mutations are shown. B. Mirrored time tree coloured by genetic lineages sensu PANGOLIN v.May19 (https://github.com/cov-lineages/). Each tip with a circle represents a genome from the Basel area cohort, branches without circled tips represent global genomes, showing the global context of the Basel genomes.
Fig 4. Divergence trees plotting nucleotide divergence between 468 genomes and expanded clusters of genomes in selected phylogenetic lineages.
A. Genomes from the Basel area cohort in global context. Tree composition is identical to the time tree from Fig 3. Branches with circles at the tip represent genomes from the present study; branches without circles represent global genomes from GISAID. The major Basel cluster contains genomes with up to five mutations from its ancestral node. B. Detail of a mixed cluster derived from B.1.1 with seven to nine mutations to the root. A single genome assigned to lineage B.1.1.6 has an assumed origin in Austria; two genomes (B.1.1.10) most likely originate from the UK. C. Potential ski-holiday-related cluster (C1059T), in which seven samples had confirmed associations with skiing destinations. D. Proportion of B.1-C15324T genomes among all publicly available genomes per country from the five countries in which this variant was first registered, up to March 23rd. Note: The divergence scale can be translated to the number of mutations relative to the root (Wuhan-Hu-1) by multiplying by the SARS-CoV-2 genome size (29903 bases).
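The conversion described in the note to Fig 4 is a single multiplication. The sketch below is illustrative only: the function names are hypothetical, and rounding to the nearest whole mutation is an assumption, since the caption does not say how fractional values are handled.

```python
# Genome size in bases, as stated in the Fig 4 note.
GENOME_LENGTH = 29903


def divergence_to_mutations(divergence, genome_length=GENOME_LENGTH):
    """Approximate number of mutations relative to the root (Wuhan-Hu-1).

    `divergence` is the per-site divergence read off the tree axis.
    Rounding to the nearest integer is an assumption of this sketch.
    """
    return round(divergence * genome_length)


def mutations_to_divergence(mutations, genome_length=GENOME_LENGTH):
    """Inverse conversion: a mutation count expressed as per-site divergence."""
    return mutations / genome_length
```

Under this conversion, the seven to nine mutations reported for the B.1.1-derived cluster in panel B correspond to divergence values of roughly 2.3e-4 to 3.0e-4 on the tree axis.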

Source: PubMed
