Bioinformatics Vol. 16 no. 2 2000
Pages 117-124
© 2000 Oxford University Press
Fast assignment of protein structures to sequences using the Intermediate Sequence Library PDB-ISL
1 MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK
2 Department of Genetics, Harvard Medical School, Warren Alpert Building, 200 Longwood Avenue, Boston, MA 02115, USA
Motivation: For large-scale structural assignment to sequences, as in computational structural genomics, a fast yet sensitive sequence search procedure is essential. A new approach using intermediate sequences was tested as a shortcut to iterative multiple sequence search methods such as PSI-BLAST.
Results: A library containing potential intermediate sequences for proteins of known structure (PDB-ISL) was constructed. The sequences in the library were collected from a large sequence database using the sequences of the domains of proteins of known structure as the query sequences and the program PSI-BLAST. Sequences of proteins of unknown structure can be matched to distantly related proteins of known structure by using pairwise sequence comparison methods to find homologues in PDB-ISL. Searches of PDB-ISL were calibrated, and the number of correct matches found at a given error rate was the same as that found by PSI-BLAST. The advantage of this library is that it uses pairwise sequence comparison methods, such as FASTA or BLAST2, and can, therefore, be searched easily and, in many cases, much more quickly than an iterative multiple sequence comparison method. The procedure is roughly 20 times faster than PSI-BLAST for small genomes and several hundred times faster for large genomes.
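The lookup step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the pairwise search (for example FASTA or BLAST2) has already been run and its output parsed into tuples, and that the library is available as a mapping from each intermediate sequence to the domains of known structure whose PSI-BLAST search collected it. The function name `assign_structures`, the data layout, the identifiers and the E-value cutoff of 0.001 are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def assign_structures(hits, library, max_evalue=0.001):
    """Link queries to domains of known structure via intermediate sequences.

    hits:    iterable of (query_id, intermediate_id, evalue) tuples taken
             from a pairwise search of the library (assumed pre-parsed).
    library: dict mapping intermediate_id -> list of domain IDs whose
             PSI-BLAST search collected that intermediate sequence.
    max_evalue: illustrative cutoff, not a value from the paper.
    """
    assignments = defaultdict(dict)
    for query_id, intermediate_id, evalue in hits:
        if evalue > max_evalue:
            continue  # skip matches weaker than the chosen cutoff
        for domain_id in library.get(intermediate_id, ()):
            best = assignments[query_id].get(domain_id)
            # keep the strongest (lowest E-value) link to each domain
            if best is None or evalue < best:
                assignments[query_id][domain_id] = evalue
    return dict(assignments)


# Toy data; every identifier here is made up.
library = {"seqA": ["d1abc_"], "seqB": ["d1abc_", "d2xyz_"]}
hits = [
    ("orf1", "seqA", 1e-10),
    ("orf1", "seqB", 1e-5),
    ("orf2", "seqB", 0.5),
]
result = assign_structures(hits, library)
```

In this toy run, `orf1` is linked to both domains through its intermediate matches, while `orf2` receives no assignment because its only hit is above the cutoff. The point of the design is that the expensive iterative search is done once, when the library is built, so each new query needs only a single pairwise search followed by a dictionary lookup.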
Availability: Sequences can be submitted to the PDB-ISL servers at http://stash.mrc-lmb.cam.ac.uk/PDB·ISL/ or http://cyrah.ebi.ac.uk:1111/Serv/PDB·ISL/ and can be downloaded from ftp://stash.mrc-lmb.cam.ac.uk/pub/PDB·ISL/ or ftp://ftp.ebi.ac.uk/pub/contrib/jong/PDB·ISL/
Contact: sat{at}mrc-lmb.cam.ac.uk and jong{at}ebi.ac.uk
Received on July 20, 1999; revised on September 17, 1999; accepted on September 23, 1999