Bioinformatics Advance Access published online on February 22, 2005

Bioinformatics, doi:10.1093/bioinformatics/bti335
All versions of this article: 21/10/2230 (most recent); bti335v1

© The Author (2005). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oupjournals.org
Received July 14, 2004
Revised January 27, 2005
Accepted February 16, 2005

Article

A word-oriented approach to alignment validation

Robert G. Beiko 1, Cheong Xin Chan 1, and Mark A. Ragan 1*

1 Institute for Molecular Bioscience, ARC Centre in Bioinformatics, The University of Queensland, Brisbane, Australia

* To whom correspondence should be addressed.
Mark A. Ragan, E-mail: m.ragan{at}imb.uq.edu.au


Abstract

Motivation: Multiple sequence alignment at the level of whole proteomes requires a high degree of automation, precluding the use of traditional validation methods such as manual curation. Since evolutionary models are too general to describe the history of each residue in a protein family, there is no single algorithm/model combination that can yield a biologically or evolutionarily optimal alignment. We propose a "shotgun" strategy where many different algorithms are used to align the same family, and the best of these alignments is then chosen with a reliable objective function. We present WOOF, a novel "word-oriented objective function" that relies on the identification and scoring of conserved amino acid patterns (words) between pairs of sequences.
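The "shotgun" selection step described above can be illustrated with a minimal Python sketch. It is not the WOOF scoring scheme, whose details are not given in this abstract. The toy score below is an assumption made for illustration only: it treats identical k-letter substrings shared by two sequences as "words" and rewards an alignment for placing each shared word in the same columns in both sequences. The function names, the word length `k`, and the example alignments are all hypothetical.

```python
from itertools import combinations


def residue_columns(row):
    """Map each residue index (in the ungapped sequence) to its alignment column."""
    return [col for col, ch in enumerate(row) if ch != "-"]


def shared_words(seq_a, seq_b, k):
    """Yield (i, j) start positions where the same k-letter word occurs in both sequences."""
    starts_b = {}
    for j in range(len(seq_b) - k + 1):
        starts_b.setdefault(seq_b[j:j + k], []).append(j)
    for i in range(len(seq_a) - k + 1):
        for j in starts_b.get(seq_a[i:i + k], []):
            yield i, j


def toy_word_score(alignment, k=3):
    """Fraction of shared-word occurrences placed in identical columns.

    Illustrative stand-in only; this is NOT the published WOOF function.
    """
    aligned = total = 0
    for row_a, row_b in combinations(alignment, 2):
        cols_a, cols_b = residue_columns(row_a), residue_columns(row_b)
        seq_a, seq_b = row_a.replace("-", ""), row_b.replace("-", "")
        for i, j in shared_words(seq_a, seq_b, k):
            total += 1
            if cols_a[i:i + k] == cols_b[j:j + k]:
                aligned += 1
    return aligned / total if total else 0.0


def pick_best(candidates, k=3):
    """Shotgun selection: return the name of the highest-scoring candidate alignment."""
    return max(candidates, key=lambda name: toy_word_score(candidates[name], k))


# Two hypothetical alignments of the same pair of sequences.
candidates = {
    "aligner_a": ["MKVLAG-", "-KVLAGT"],  # shared words line up
    "aligner_b": ["MKVLAG", "KVLAGT"],    # shared words are offset by one column
}
best = pick_best(candidates)
```

In this toy case `aligner_a` is selected because every shared three-letter word occupies the same columns in both rows. A real objective function would need to handle non-identical (merely conserved) words and repeated words, which this sketch does not attempt.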

Results: Tests on a subset of reference protein alignments from BAliBASE showed that WOOF tended to rank the (manually curated) reference alignment highest among 1060 alternative (automatically generated) alignments for a majority of protein families. Among the automated alignments, there was a strong positive relationship between the WOOF score and similarity to the reference alignment. The speed of WOOF and its independence from explicit considerations of three-dimensional structure make it an excellent tool for analysing large numbers of protein families.
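The abstract does not say how similarity to the reference alignment was measured. One common choice for benchmarks of this kind is a sum-of-pairs measure: the fraction of residue pairs aligned in the reference that are also aligned in the test alignment. The sketch below assumes that measure, and assumes both alignments list the same sequences in the same order; the function names and example data are hypothetical.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of residue pairs that share a column.

    Each pair is (seq_index_a, residue_index_a, seq_index_b, residue_index_b),
    with residue indices counted in the ungapped sequences.
    """
    pairs = set()
    counters = [0] * len(alignment)  # next residue index for each sequence
    for column in zip(*alignment):
        present = []
        for s, ch in enumerate(column):
            if ch != "-":
                present.append((s, counters[s]))
                counters[s] += 1
        for (sa, ra), (sb, rb) in combinations(present, 2):
            pairs.add((sa, ra, sb, rb))
    return pairs


def sum_of_pairs_similarity(test, reference):
    """Fraction of reference residue pairs that the test alignment reproduces."""
    ref_pairs = aligned_pairs(reference)
    if not ref_pairs:
        return 0.0
    return len(ref_pairs & aligned_pairs(test)) / len(ref_pairs)


reference = ["MKVLAG-", "-KVLAGT"]
partly_right = ["MKVLAG--", "-KVLA-GT"]  # misplaces the final G of the second sequence
similarity = sum_of_pairs_similarity(partly_right, reference)
```

Plotting a value such as `similarity` against an objective-function score for each automated alignment is one way to examine the kind of relationship reported in the Results.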

Availability: On request from the authors.

