Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB

T Z DeSantis, P Hugenholtz, N Larsen, M Rojas, E L Brodie, K Keller, T Huber, D Dalevi, P Hu, G L Andersen

Abstract

A 16S rRNA gene database (http://greengenes.lbl.gov) addresses limitations of public repositories by providing chimera screening, standard alignment, and taxonomic classification using multiple published taxonomies. Taxonomic nomenclature was found to be incongruent among curators, even at the phylum level. Putative chimeras were identified in 3% of environmental sequences and in 0.2% of records derived from isolates. Environmental sequences were classified into 100 phylum-level lineages in the Archaea and Bacteria.

Figures

FIG. 1.
16S rRNA gene sequencing projects that produced more than 200 full-length records. All projects were submitted to GenBank between October 2000 and February 2006. Sequences were generated from gastrointestinal (GI), soil (SO), vaginal (VG), aerosol (AR), culture collection (CC), insect (IN), water (WA), waste treatment (WT), and fecal (FC) sources as indicated on the x axis. The projects are ordered by sequence count.
FIG. 2.
Phylum-level nomenclature shared by independent curators represented as a five-way Venn diagram. Yellow spheres represent the 126 phylum or candidate division names encountered in at least one of the five taxonomy systems (Pace, Hugenholtz, Ludwig, RDP, or NCBI). The numbers in parentheses are the counts for phylum or candidate division names recognized by an individual curator. Clusters of yellow spheres connected by more than one colored web symbolize names recognized by multiple curators. The image was rendered by the AutoFocus software (Aduna B.V., The Netherlands). A complete table of phylum-level nomenclature comparisons is available at http://greengenes.lbl.gov/TaxCompare.

Source: PubMed
