Bioinformatics Advance Access originally published online on January 29, 2004
Bioinformatics 20(6) © Oxford University Press 2004; all rights reserved.
Quality of alignment comparison by COMPASS improves with inclusion of diverse confident homologs
Howard Hughes Medical Institute, and Department of Biochemistry, University of Texas Southwestern Medical Center, 5323, Harry Hines Blvd, Dallas, TX 75390-9050, USA
Received on March 19, 2003; revised on October 16, 2003; accepted on October 17, 2003
Advance Access Publication January 29, 2004
Motivation: Adding more distant homologs to a multiple alignment and thus increasing its diversity may eventually deteriorate the numerical profile constructed from this alignment. Here, we addressed the question whether such a diversity limit can be reached in the alignments of confident homologs found by PSI-BLAST, and we analyzed the dependence of the quality of the profile-profile comparison made by COMPASS on the sequence diversity within these alignments.
Results: Protein families that have a greater number of diverse confident homologs in the current sequence databases provide an increased quality of similarity detection in profile databases, but produce on average less accurate profile-profile alignments with their remote relatives. This lower alignment accuracy cannot be improved by excluding the most distant members of these families from their profiles. On the contrary, the presence of more diverse members results in more accurate alignments. For families with a high diversity of confident homologs, the lower quality of profile alignments with their remote relatives seems to be an attribute of these families or their alignments, rather than a consequence of the large number of diverse sequences itself. Our results suggest that at any level of profile diversity, one should include in the multiple alignment as many confident sequence homologs as possible in order to produce the most accurate results.
Contact: grishin{at}chop.swmed.edu
* To whom correspondence should be addressed.

