High precision Neisseria gonorrhoeae variant and antimicrobial resistance calling from metagenomic Nanopore sequencing

Nicholas D Sanderson, Jeremy Swann, Leanne Barker, James Kavanagh, Sarah Hoosdally, Derrick Crook, GonFast Investigators Group, Teresa L Street, David W Eyre

Abstract

The rise of antimicrobial-resistant Neisseria gonorrhoeae is a significant public health concern. Against this background, rapid culture-independent diagnostics may allow targeted treatment and prevent onward transmission. We have previously shown that metagenomic sequencing of urine samples from men with urethral gonorrhea can recover near-complete N. gonorrhoeae genomes. However, disentangling the N. gonorrhoeae genome from metagenomic samples and robustly identifying antimicrobial resistance determinants from error-prone Nanopore sequencing is a substantial bioinformatics challenge. Here, we present an N. gonorrhoeae diagnostic workflow for the analysis of metagenomic sequencing data obtained from clinical samples using R9.4.1 Nanopore sequencing. We compared results from simulated and clinical infections with data from known reference strains and Illumina sequencing of isolates cultured from the same patients. We evaluated three Nanopore variant callers (Clair, Nanopolish, and Medaka) and developed a random forest classifier to filter called SNPs. Clair was the most suitable variant caller after SNP filtering. A minimum read depth of 20× was required to confidently identify resistance determinants over the entire genome. Our findings show that metagenomic Nanopore sequencing can provide reliable diagnostic information in N. gonorrhoeae infection.
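
As an illustration of how the 20× depth requirement might be applied as a gate before resistance calling, the following is a minimal sketch rather than the authors' workflow. It assumes per-position read depths are already available as a list of integers; the helper name `meets_depth_threshold` and the use of the median as the summary statistic are assumptions made for this example.

```python
from statistics import median

MIN_DEPTH = 20  # 20x threshold reported in the abstract


def meets_depth_threshold(depths, min_depth=MIN_DEPTH):
    """Return (median_depth, passed) for a list of per-position read depths.

    Hypothetical helper: the input format and the choice of the median
    as the summary statistic are assumptions for illustration only.
    """
    if not depths:
        # No coverage information at all: treat as failing the gate.
        return 0, False
    med = median(depths)
    return med, med >= min_depth
```

A sample whose median depth falls below the threshold would be flagged as unsuitable for genome-wide resistance determinant calling under this sketch; how borderline samples are handled is left open.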

© 2020 Sanderson et al.; Published by Cold Spring Harbor Laboratory Press.

Figures

Figure 1.
Detection of SNPs using QUAL scores alone. Swarm plots of true (orange) and false (blue) SNPs detected by Clair (top), Nanopolish (middle), and Medaka (bottom). Each column is a different sequence. Each row uses a different y-axis scale.
Figure 2.
Random forest–based variant filtering using Nanopolish, Medaka, and Clair. (A) Receiver operating characteristic (ROC) curve for random forest classifier using different features including Quality (QUAL only, dashed line) and a composite selection of input features (Composite, solid line) for Nanopolish (green), Medaka (orange), and Clair (blue). AUC for each variant caller: Nanopolish 0.86 to 0.98, Medaka 0.93 to 0.97, Clair 0.84 to 0.97, using QUAL and Composite features, respectively. (B) Bar chart of feature importance for composite selection of features used to train the classifier.
Figure 3.
Effect of read coverage depth on SNP calling for each strain and variant caller. (A) SNP recall by median depth of coverage. (B) False positive SNPs (FP) by median depth of coverage. Colors represent different sequences and shapes represent variant callers: circles are Clair, crosses are Medaka, and squares are Nanopolish. Insets show upper and lower regions of the y-axis in more detail for A and B, respectively.
Figure 4.
Recombination-corrected maximum likelihood tree of metagenomic Nanopore and paired Illumina isolate sequences. All Nanopore consensus sequences were generated from metagenomic sequencing with the exception of H18-208 and WHO Q, which were sequenced from isolates.
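
The quantities plotted in Figures 2 and 3 (AUC, SNP recall, and false positive counts) can be illustrated with a short sketch. This is not the authors' code: it assumes SNPs are represented as `(position, alt_base)` tuples and that a truth set is available from a reference strain or paired Illumina data. The function names `snp_recall_and_fp` and `auc_from_scores` are hypothetical. The AUC is computed with the rank-based (Mann-Whitney) formulation, which equals the area under the ROC curve for a single score such as QUAL.

```python
def snp_recall_and_fp(called, truth):
    """Return (recall, false_positive_count).

    called, truth: sets of (position, alt_base) tuples (assumed format).
    """
    true_pos = called & truth    # calls that match the truth set
    false_pos = called - truth   # calls absent from the truth set
    recall = len(true_pos) / len(truth) if truth else float("nan")
    return recall, len(false_pos)


def auc_from_scores(true_scores, false_scores):
    """Rank-based AUC for one score (e.g. QUAL).

    Equals the probability that a randomly chosen true SNP scores higher
    than a randomly chosen false SNP, counting ties as one half.
    """
    if not true_scores or not false_scores:
        raise ValueError("need at least one score in each class")
    wins = 0.0
    for t in true_scores:
        for f in false_scores:
            if t > f:
                wins += 1.0
            elif t == f:
                wins += 0.5
    return wins / (len(true_scores) * len(false_scores))
```

The pairwise loop is quadratic in the number of SNPs, which is acceptable for a demonstration; a sort-based implementation would be preferable for genome-scale call sets.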


Source: PubMed
