
Bioinformatics Vol. 19 no. 13 2003
Pages 1672-1681
© 2003 Oxford University Press

Comparison of sequence masking algorithms and the detection of biased protein sequence regions

David P. Kreil 1,2,* and Christos A. Ouzounis 2

1 Department of Genetics/Inference Group (Cavendish Laboratory), University of Cambridge, Cambridge, UK and 2 Computational Genomics Group, The European Bioinformatics Institute, EMBL Outstation Cambridge CB10 1SD, UK

Received on October 25, 2002; revised on February 7, 2003; accepted on March 4, 2003

Motivation: Separation of protein sequence regions according to their local information complexity and subsequent masking of low complexity regions has greatly enhanced the reliability of function prediction by sequence similarity. Comparisons with alternative methods that focus on compositional sequence bias rather than information complexity measures have shown that removal of compositional bias yields at least as sensitive and much more specific results. Besides the application of sequence masking algorithms to sequence similarity searches, the study of the masked regions themselves is of great interest. Traditionally, however, these have been neglected despite evidence of their functional relevance.
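To make the idea of masking by local information complexity concrete, here is a minimal sketch. It is not the algorithm of any masking program compared in the article: it scores each sliding window by the Shannon entropy of its residue composition and masks windows below a cutoff. The function names, the window length of 12, and the threshold of 2.2 bits are illustrative assumptions.

```python
import math
from collections import Counter


def window_entropy(window: str) -> float:
    """Shannon entropy (bits) of the residue composition of one window."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mask_low_complexity(seq: str, window: int = 12,
                        threshold: float = 2.2, mask_char: str = "X") -> str:
    """Replace every residue that lies in a low-entropy window by mask_char.

    window and threshold are assumed, illustrative values.
    Sequences shorter than the window are returned unchanged.
    """
    masked = list(seq)
    for start in range(len(seq) - window + 1):
        if window_entropy(seq[start:start + window]) < threshold:
            for i in range(start, start + window):
                masked[i] = mask_char
    return "".join(masked)
```

A homopolymer run such as twelve consecutive Q residues has entropy 0 and is masked in full, while a window of twelve distinct residues is left untouched. A measure of compositional bias would instead compare window composition against a background distribution, which this sketch does not attempt.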

Results: Here we demonstrate that compositional bias seems to be a more effective measure for the detection of biologically meaningful signals. Typical results on proteins are compared to results for sequences that have been randomized in various ways, conserving composition and local correlations for individual proteins or the entire set. It is remarkable that low-complexity regions have the same form of distribution in proteins as in randomized sequences, and that the signal from randomized sequences with conserved local correlations and amino acid composition almost matches the signal from proteins. This is not the case for sequence bias, which hence seems to be a genuinely biological phenomenon in contrast to patches of low complexity.
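The simplest of the randomizations mentioned above, conserving amino acid composition either for individual proteins or for the entire set, can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, and the variants that also conserve local correlations are not covered.

```python
import random


def shuffle_within(seq: str, rng: random.Random) -> str:
    """Permute one protein's residues, conserving its own composition."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


def shuffle_across(seqs: list[str], rng: random.Random) -> list[str]:
    """Permute residues across the whole set.

    The pooled composition and each sequence length are conserved,
    but the composition of an individual protein is not.
    """
    pool = [residue for seq in seqs for residue in seq]
    rng.shuffle(pool)
    shuffled, pos = [], 0
    for seq in seqs:
        shuffled.append("".join(pool[pos:pos + len(seq)]))
        pos += len(seq)
    return shuffled
```

Passing an explicit `random.Random(seed)` instance keeps the randomization reproducible, so that a masking method can be run on the same randomized set more than once.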

Availability: Software in executable form is available on request from the authors.

Supplementary information: An online supplement with additional supporting figures is available at http://www.inference.phy.cam.ac.uk/dpk20/sup/

Contact: kreil@ebi.ac.uk

* To whom correspondence should be addressed at: Computational Genomics Group, The European Bioinformatics Institute, EMBL Outstation Cambridge CB10 1SD, UK






