Using MALDI-TOF spectra in epidemiological surveillance for the detection of bacterial subgroups with a possible epidemic potential

Audrey Giraud-Gatineau, Gaetan Texier, Pierre-Edouard Fournier, Didier Raoult, Hervé Chaudet

Abstract

Background: For epidemiological surveillance, the Hospital University Institute Méditerranée Infection has operated since 2013 a system named MIDaS, based on the systematic collection of routine laboratory activity data and results, including MALDI-TOF spectra. This paper presents the pipeline we use to process MALDI-TOF spectra during epidemiological surveillance in order to reveal proteinic cues that may suggest an ongoing epidemic process, as a complement to incidence surveillance. The pipeline is illustrated by the analysis of an alarm raised for Streptococcus pneumoniae.

Methods: The MALDI-TOF spectra analysis process looks for clusters of spectra that are close both in time and in proteinic profile. It relies on several specific methods for contrasting and clustering the spectra, for presenting the results graphically so that they are easy to interpret epidemiologically, and for determining the discriminating spectral peaks, with their possible identification using reference databases.
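The idea of a cluster defined by joint time and proteinic proximity can be illustrated with a minimal sketch. It is not the pipeline described here: it assumes spectra are already reduced to peak lists, and it substitutes a Jaccard distance on binned peaks and a simple single-linkage grouping for the clustering methods actually used. The function names, the bin width, the thresholds, and the toy samples are all hypothetical.

```python
from datetime import date
from itertools import combinations


def peak_bins(peaks_mz, bin_width=2.0):
    """Map m/z peak positions to integer bins so that nearby peaks compare as equal."""
    return {round(mz / bin_width) for mz in peaks_mz}


def jaccard_distance(a, b):
    """Distance between two sets of peak bins: 0 = identical, 1 = disjoint."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def time_protein_clusters(samples, max_protein_dist=0.3, max_days=30, bin_width=2.0):
    """Group samples that are linked by pairs close in BOTH time and peak profile.

    samples: list of (sample_id, sampling_date, list_of_peak_mz).
    Returns a list of sets of sample ids (groups with at least two members).
    Thresholds are illustrative assumptions, not values from the paper.
    """
    binned = {sid: peak_bins(peaks, bin_width) for sid, _, peaks in samples}
    dates = {sid: day for sid, day, _ in samples}
    parent = {sid: sid for sid in binned}

    def find(x):
        # Union-find lookup with path compression.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(binned, 2):
        close_protein = jaccard_distance(binned[a], binned[b]) <= max_protein_dist
        close_time = abs((dates[a] - dates[b]).days) <= max_days
        if close_protein and close_time:
            parent[find(a)] = find(b)

    groups = {}
    for sid in binned:
        groups.setdefault(find(sid), set()).add(sid)
    return [g for g in groups.values() if len(g) > 1]


# Synthetic toy data (not from the study).
samples = [
    ("s1", date(2020, 1, 5), [3000.0, 5228.8, 5917.8, 8974.3]),
    ("s2", date(2020, 1, 12), [3000.4, 5228.5, 5917.5, 8974.0]),  # same profile, same period
    ("s3", date(2019, 6, 1), [3000.0, 5228.8, 5917.8, 8974.3]),   # same profile, much earlier
    ("s4", date(2020, 1, 8), [2500.0, 4100.0, 6100.0, 7000.0]),   # same period, other profile
]
clusters = time_protein_clusters(samples)
print([sorted(c) for c in clusters])
```

In this toy case only the two samples that agree on both criteria are grouped; a sample with the same profile but a distant date, or a concomitant sample with a different profile, stays outside the cluster.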

Results: Applying this pipeline to an alarm issued for Streptococcus pneumoniae revealed a cluster of spectra with close proteinic and temporal distances, characterized by the presence of three discriminant peaks (5228.8, 5917.8, and 8974.3 m/z) and the absence of peak 4996.9 m/z. A further search in UniProtKB showed that peak 5228.8 is possibly an OxaA protein and that the absent peak may be a transposase.
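How peaks come to be labelled as present in or absent from a cluster can be sketched as follows. This is only an illustration under stated assumptions: it ranks peaks by the difference in presence frequency between the cluster and the remaining spectra, a deliberately simple stand-in for the binary discriminant analysis used in the pipeline. The function names, the bin width, and the toy spectra are hypothetical.

```python
def peak_bins(peaks_mz, bin_width=2.0):
    """Map m/z peak positions to integer bins so that nearby peaks compare as equal."""
    return {round(mz / bin_width) for mz in peaks_mz}


def contrast_peaks(cluster_spectra, other_spectra, bin_width=2.0):
    """Rank peak bins by how differently they occur inside and outside a cluster.

    Each spectrum is a list of peak m/z values. Returns rows of
    (bin_mz, freq_in_cluster, freq_outside, difference), sorted by decreasing
    absolute difference. A positive difference means the peak is associated
    with the cluster; a negative one means it tends to be absent from it.
    """
    if not cluster_spectra or not other_spectra:
        raise ValueError("both groups must contain at least one spectrum")
    cluster_sets = [peak_bins(p, bin_width) for p in cluster_spectra]
    other_sets = [peak_bins(p, bin_width) for p in other_spectra]
    all_bins = set().union(*cluster_sets, *other_sets)

    rows = []
    for b in all_bins:
        f_in = sum(b in s for s in cluster_sets) / len(cluster_sets)
        f_out = sum(b in s for s in other_sets) / len(other_sets)
        rows.append((b * bin_width, f_in, f_out, f_in - f_out))
    # Strongest contrasts first; ties broken by m/z for a stable order.
    rows.sort(key=lambda r: (-abs(r[3]), r[0]))
    return rows


# Synthetic toy spectra (not from the study).
cluster_spectra = [[5228.8, 7000.0], [5228.5, 7000.2]]
other_spectra = [[4996.9, 7000.0], [4996.6, 7000.1], [4996.8, 5228.8, 7000.0]]
ranking = contrast_peaks(cluster_spectra, other_spectra)
for bin_mz, f_in, f_out, diff in ranking:
    print(f"{bin_mz:8.1f}  in={f_in:.2f}  out={f_out:.2f}  diff={diff:+.2f}")
```

A frequency difference has no notion of statistical significance, so with only a handful of spectra in a cluster such a ranking should be read as a screening aid rather than as evidence.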

Conclusion: This example shows that the pipeline may support quasi-real-time identification and characterization of clusters that provide essential information on a potentially epidemic situation. It supplies valuable information for epidemiological sensemaking and for deciding whether to continue the epidemiological investigation, in particular whether to commit additional costly resources to confirm or invalidate the alarm.

Clinical trials registration: NCT03626987.

Keywords: Cluster analysis; Epidemic; Epidemiological surveillance; MALDI-TOF.

Conflict of interest statement

All authors report no potential conflicts.

© 2021. The Author(s).

Figures

Fig. 1
Process flow of the matrix-assisted laser desorption ionization mass spectrometry mass spectra analysis, from databases to system outputs
Fig. 2
Number of Streptococcus pneumoniae samples from 10 October 2019 to 12 March 2020 at AP-HM, Marseille
Fig. 3
Complete time-heated dendrogram of the 125 main spectrum profiles of S. pneumoniae, illustrating the use of leaf label coloring. The color scale shows case recency, from the most ancient (blue) to the most recent (red), and thereby case concomitance. The cluster of interest (red square) is shown with an enlargement of the dendrogram illustrating possible leaf labelling using surveillance data. The stars near the leaves mark the spectra involved in the alarm emitted by BALYSES
Fig. 4
Double time-protein proximity heatmap resulting from the analysis of the 125 main spectrum profiles (MSP) of S. pneumoniae. The cluster of interest is indicated. The bottom-right hemi-matrix shows the samples' proteinic proximity. The top-left hemi-matrix shows the samples' time concomitance. The color scale is the same for the two hemi-matrices: blue corresponds to the largest distances and red to the smallest ones. Spectra with close time-protein distances appear as a hot-coloured square along the matrix diagonal, as for subtree A. The stars near the leaves mark the spectra involved in the alarm emitted by BALYSES. Subtree D is of less epidemiological interest because of its temporal heterogeneity
Fig. 5
Binary discriminant analysis of the 125 main spectrum profiles (MSP) of S. pneumoniae showing the 40 top-ranking peaks contrasting the 7 samples belonging to the cluster of interest against the other samples. Peaks are identified by their m/z. For each selected peak the entropic ranking t-score is shown; it is positive when the peak is associated with the group

Source: PubMed