Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
S F Altschul, T L Madden, A A Schäffer, J Zhang, Z Zhang, W Miller, D J Lipman
Abstract
The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.
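To make the idea of a position-specific score matrix concrete, the following is a minimal sketch and not the PSI-BLAST implementation. It assumes a gapless toy alignment, a uniform background frequency, and a fixed pseudocount; the abstract specifies none of these, and the function names `build_pssm` and `best_window` are hypothetical. The real program works from statistically significant BLAST alignments, which this sketch does not model.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned, pseudocount=1.0):
    """Build a toy position-specific score matrix from a gapless alignment.

    Returns one dict per alignment column, mapping residue -> log-odds score.
    Assumptions (not from the abstract): uniform background frequencies and
    a fixed additive pseudocount.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    background = 1.0 / len(AMINO_ACIDS)  # assumed uniform background
    total = len(aligned) + pseudocount * len(AMINO_ACIDS)

    pssm = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts.get(aa, 0) + pseudocount) / total
            column[aa] = math.log2(freq / background)
        pssm.append(column)
    return pssm


def best_window(pssm, sequence):
    """Slide the matrix along `sequence`; return (best score, start offset).

    Residues outside the 20 standard amino acids score 0.0 (an assumption).
    """
    width = len(pssm)
    best = (float("-inf"), -1)
    for start in range(len(sequence) - width + 1):
        score = sum(
            pssm[i].get(sequence[start + i], 0.0) for i in range(width)
        )
        if score > best[0]:
            best = (score, start)
    return best


if __name__ == "__main__":
    toy_alignment = ["ACDE", "ACDE", "ACDF"]  # made-up example data
    matrix = build_pssm(toy_alignment)
    print(best_window(matrix, "GGGACDEGGG"))
```

Because each column carries its own scores, a residue conserved across the aligned sequences is rewarded more at that position than a single general substitution matrix would allow. That is the intuition behind the sensitivity gain the abstract reports for weak similarities.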
Source: PubMed