Bioinformatics Vol. 15 no. 12 1999
Pages 1000-1011
© 1999 Oxford University Press

IMPALA: matching a protein sequence against a collection of PSI-BLAST-constructed position-specific score matrices

Alejandro A.Schäffer 1, Yuri I.Wolf 1, Chris P.Ponting 1, Eugene V.Koonin 1, L. Aravind 2 and Stephen F.Altschul 1

1 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
2 Department of Biology, Texas A&M University, Biological Sciences Building West, College Station, TX 77843, USA

Present address: MRC Functional Genetics Unit, Department of Human Anatomy and Genetics, University of Oxford, South Parks Road, Oxford OX1 3QX, UK.

Motivation: Many studies have shown that database searches using position-specific score matrices (PSSMs) or profiles as queries are more effective at identifying distant protein relationships than are searches that use simple sequences as queries. One popular program for constructing a PSSM and comparing it with a database of sequences is Position-Specific Iterated BLAST (PSI-BLAST).

Results: This paper describes a new software package, IMPALA, designed for the complementary procedure of comparing a single query sequence with a database of PSI-BLAST-generated PSSMs. We illustrate the use of IMPALA to search a database of PSSMs for protein folds, and one for protein domains involved in signal transduction. IMPALA’s sensitivity to distant biological relationships is very similar to that of PSI-BLAST. However, IMPALA employs a more refined analysis of statistical significance and, unlike PSI-BLAST, guarantees the output of the optimal local alignment by using the rigorous Smith–Waterman algorithm. Also, it is considerably faster when run with a large database of PSSMs than is BLAST or PSI-BLAST when run against the complete non-redundant protein database.
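As an illustration only, the sequence-versus-PSSM comparison described above can be sketched as a Smith–Waterman local alignment in which the substitution score depends on the PSSM column rather than on a fixed residue-pair matrix. This is not the IMPALA source code: the function names, the toy PSSM representation (one dictionary of residue scores per column), the linear gap penalty and the default score for unlisted residues are all assumptions made for brevity, and the statistical-significance analysis is omitted entirely.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: one {residue: score} dict per PSSM column.
Pssm = List[Dict[str, int]]


def smith_waterman_pssm(query: str, pssm: Pssm, gap: int = 4, default: int = -1) -> int:
    """Best local alignment score of `query` against `pssm`.

    Assumes a linear gap penalty `gap` and a fallback score `default`
    for residues missing from a column (both illustrative choices).
    """
    m = len(pssm)
    prev = [0] * (m + 1)  # previous row of the dynamic-programming table
    best = 0
    for residue in query:
        curr = [0] * (m + 1)
        for j in range(1, m + 1):
            # Score for aligning this residue with PSSM column j.
            diag = prev[j - 1] + pssm[j - 1].get(residue, default)
            # Local alignment: never let a cell drop below zero.
            curr[j] = max(0, diag, prev[j] - gap, curr[j - 1] - gap)
            best = max(best, curr[j])
        prev = curr
    return best


def search_library(query: str, library: Dict[str, Pssm]) -> List[Tuple[str, int]]:
    """Score `query` against every PSSM and return (name, score), best first."""
    hits = [(name, smith_waterman_pssm(query, p)) for name, p in library.items()]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    toy_library = {
        "motif_ACD": [{"A": 5}, {"C": 5}, {"D": 5}],
        "motif_WW": [{"W": 6}, {"W": 6}],
    }
    for name, score in search_library("XACDX", toy_library):
        print(name, score)
```

Because every cell of the dynamic-programming table is filled, the score returned is the optimum under the chosen scoring scheme, which is the property the abstract contrasts with heuristic search.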

Availability: The IMPALA source code, the wolf1187 database, and the aravind105 database are freely available from the NCBI ftp site ncbi.nlm.nih.gov. The databases may be found in the subdirectory ftp://ncbi.nlm.nih.gov/pub/impala. The source code is in ftp://ncbi.nlm.nih.gov/toolbox/ncbi_tools. Some IMPALA executables for different implementations of UNIX are in ftp://ncbi.nlm.nih.gov/blast/executables. IMPALA has been added as a search option on the Blocks Database Server (http://blocks.fhcrc.org/blocks/impala.html) using a library of PSSMs derived from the BLOCKS database.

Contact: schaffer@helix.nih.gov

Received on March 19, 1999; revised on July 28, 1999; accepted on August 4, 1999
