DADA2: High-resolution sample inference from Illumina amplicon data
Benjamin J Callahan, Paul J McMurdie, Michael J Rosen, Andrew W Han, Amy Jo A Johnson, Susan P Holmes
Abstract
We present the open-source software package DADA2 for modeling and correcting Illumina-sequenced amplicon errors (https://github.com/benjjneb/dada2). DADA2 infers sample sequences exactly and resolves differences of as little as 1 nucleotide. In several mock communities, DADA2 identified more real variants and output fewer spurious sequences than other methods. We applied DADA2 to vaginal samples from a cohort of pregnant women, revealing a diversity of previously undetected Lactobacillus crispatus variants.
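To make "differences of as little as 1 nucleotide" concrete, here is a minimal Python sketch that tallies exact sequences and measures how far apart two of them are. It is only an illustration of single-nucleotide resolution, not the DADA2 algorithm: it contains no error model, and the reads, the `hamming` helper, and the variable names are hypothetical and not taken from the paper or the package.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical toy reads (assumption, not data from the paper):
# two amplicon variants that differ only at the last position.
reads = ["ACGTACGTAC"] * 5 + ["ACGTACGTAT"] * 3

# Tally exact sequences, so each distinct string stays its own variant.
variants = Counter(reads)

# Compare the two most abundant variants.
(seq_a, n_a), (seq_b, n_b) = variants.most_common(2)
print(len(variants), hamming(seq_a, seq_b))
```

In this toy input the two variants remain separate even though they differ at a single position. Deciding whether such a low-abundance variant is real or a sequencing error is the part that needs an error model, which this sketch deliberately leaves out.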
Source: PubMed