Skip Navigation

This Article
Right arrow FREE Full Text (Print PDF) Freely available
Right arrow FREE Full Text (Screen PDF)
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrow Search for citing articles in:
ISI Web of Science (6)
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Bachinsky, A. G.
Right arrow Articles by Yarigin, A. A.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Bachinsky, A. G.
Right arrow Articles by Yarigin, A. A.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

Bioinformatics Vol. 16 no. 4 2000
Pages 358-366
© 2000 Oxford University Press

PROF_ PAT 1.3: Updated database of patterns used to detect local similarities

A. G. Bachinsky 1,*, A. S. Frolov 2, A. N. Naumochkin 1, L. Ph. Nizolenko 1 and A. A. Yarigin 1

1 Theoretical Department, Research Institute of Molecular Biology, SRC VB ‘Vector’,Koltsovo, Novosibirsk region, 630559, Russia
2 Institute of Cytology and Genetics SB RAS, Novosibirsk, 630090, Russia

Received on June 2, 1999 ; revised on September 29, 1999 ; accepted on October 20, 1999

*To whom correspondence should be addressed.

Motivation: When analysing novel protein sequences, it is now essential to extend search strategies to include a range of ‘secondary’ databases. Pattern databases have become vital tools for identifying distant relationships in sequences, and hence for predicting protein function and structure. The main drawback of such methods is the relatively small representation of proteins in trial samples at the time of their construction. Therefore, a negative result of an amino acid sequence comparison with such a databank forces a researcher to search for similarities in the original protein banks. We developed a database of patterns constructed for groups of related proteins with maximum representation of amino acid sequences of SWISS-PROT in the groups.

Results: Software tools and a new method have been designed to construct patterns of protein families. By using such method, a new version of databank of protein family patterns, PROF_ PAT 1.3, is produced. This bank is based on SWISS-PROT (r1.38) and TrEMBL (r1.11), and contains patterns of more than 13 000 groups of related proteins in a format similar to that of the PROSITE. Motifs of patterns, which had the minimum level of probability to be found in random sequences, were selected. Flexible fast search program accompanies the bank. The researcher can specify a similarity matrix (the type PAM, BLOSUM and other). Variable levels of similarity can be set (permitting search strategies ranging from exact matches to increasing levels of ‘fuzziness’).

Availability: The Internet address for comparing sequences with the bank is: http://wwwmgs.bionet.nsc.ru/mgs/programs/prof_pat/. The local version of the bank and search programs (approximately 50 Mb) is available via ftp: ftp://ftp.bionet.nsc.ru/pub/biology/vector/prof_pat/, and ftp://ftp.ebi.ac.uk/pub/databases/prof_pat/. Another appropriate way for its external use is to mail amino acid sequences to bachin@vector.nsc.ru for comparison with PROF_ PAT 1.3.

Contact: bachin{at}vector.nsc.ru


Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?




Disclaimer:
Please note that abstracts for content published before 1996 were created through digital scanning and may therefore not exactly replicate the text of the original print issues. All efforts have been made to ensure accuracy, but the Publisher will not be held responsible for any remaining inaccuracies. If you require any further clarification, please contact our Customer Services Department.