The Human Oral Microbiome Database: a web accessible resource for investigating oral microbe taxonomic and genomic information

Tsute Chen, Wen-Han Yu, Jacques Izard, Oxana V Baranova, Abirami Lakshmanan, Floyd E Dewhirst

Abstract

The human oral microbiome is the most studied human microflora, but 53% of the species have not yet been validly named and 35% remain uncultivated. The uncultivated taxa are known primarily from 16S rRNA sequence information. Sequence information tied solely to obscure isolate or clone numbers, and usually lacking accurate phylogenetic placement, is a major impediment to working with human oral microbiome data. The goal of creating the Human Oral Microbiome Database (HOMD) is to provide the scientific community with a comprehensive, body site-specific database for the more than 600 prokaryote species that are present in the human oral cavity, based on a curated 16S rRNA gene-based provisional naming scheme. Currently, two primary types of information are provided in HOMD: taxonomic and genomic. Named oral species and taxa identified from 16S rRNA gene sequence analysis of oral isolates and cloning studies were placed into defined 16S rRNA phylotypes, and each was given a unique Human Oral Taxon (HOT) number. The HOT interlinks phenotypic, phylogenetic, genomic, clinical and bibliographic information for each taxon. A BLAST search tool is provided to match user 16S rRNA gene sequences to a curated, full-length 16S rRNA gene reference data set. For genomic analysis, HOMD provides a comprehensive set of analysis tools and maintains frequently updated annotations for all the human oral microbial genomes that have been sequenced and publicly released. Oral bacterial genome sequences, determined as part of the Human Microbiome Project, are being added to HOMD as they become available. We provide HOMD as a conceptual model for the presentation of microbiome data for other human body sites. Database URL: http://www.homd.org.

Source: PubMed
