
Bioinformatics Vol. 17 no. 3 2001
Pages 272-279
© 2001 Oxford University Press


Original Paper

Picasso: generating a covering set of protein family profiles

Andreas Heger and Liisa Holm

Structural Genomics Group, EMBL-EBI, Cambridge CB10 1SD, UK

Received on December 16, 1999; revised on October 31, 2000; accepted on November 6, 2000

Motivation: Evolutionary classification leads to an economical description of protein sequence data because attributes of function and structure are inherited in protein families. This paper presents Picasso, a procedure for deriving a minimal set of protein family profiles that cover all known protein sequences.

Results: Picasso starts from highly overlapping sequence neighbourhoods revealed by all-on-all pairwise Blast alignment. Overlaps are reduced by merging sequences or parts of sequences into multiple alignments. For maximum unification, the multiple alignments must reach into the twilight zone of sequence similarity. Sensitive and selective profile–profile comparison allows unification down to about 15% pairwise sequence identity. Families unified through a short conserved sequence motif are associated with multiple full-length alignments describing different subfamilies. Domains that are mobile modules are identified based on their association with different sets of neighbours. The result is 10000 unified domain families (excluding singletons) representing functionally related proteins and recovering classical prolific domain types in high numbers. The classification is useful, for example, in developing strategies for efficient database searching and for selecting targets to complete the map of all 3-D structures.
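Illustration: The notion of a covering set can be made concrete with a toy example. The sketch below is not the Picasso procedure, which merges sequences into multiple alignments and relies on profile–profile comparison. It only shows a greedy set-cover heuristic over sequence neighbourhoods, assuming each neighbourhood has already been extracted from all-on-all Blast hits. The function name `greedy_cover`, the sequence labels, and the example data are hypothetical.

```python
from typing import Dict, List, Set


def greedy_cover(neighbourhoods: Dict[str, Set[str]]) -> List[str]:
    """Choose seed sequences whose neighbourhoods jointly cover all sequences.

    `neighbourhoods` maps a sequence identifier to the identifiers of its
    neighbours (hypothetical input, e.g. derived from pairwise Blast hits).
    A seed always covers itself as well as its neighbours.
    """
    # Every sequence that appears as a seed or as a neighbour must be covered.
    uncovered: Set[str] = set(neighbourhoods)
    for members in neighbourhoods.values():
        uncovered |= members

    seeds = sorted(neighbourhoods)  # fixed order makes tie-breaking deterministic
    chosen: List[str] = []
    while uncovered:
        # Pick the seed that covers the most still-uncovered sequences.
        best = max(seeds, key=lambda s: len((neighbourhoods[s] | {s}) & uncovered))
        chosen.append(best)
        uncovered -= neighbourhoods[best] | {best}
    return chosen


# Hypothetical toy data: two overlapping families and one singleton.
example = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"E"},
    "E": {"D"},
    "F": set(),
}
print(greedy_cover(example))
```

In this toy input, three seeds suffice to cover six sequences, which illustrates why highly overlapping neighbourhoods admit a much smaller covering set. Greedy selection is a standard approximation for set cover and does not guarantee a minimum.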

Availability: http://www.embl-ebi.ac.uk/picasso/picasso.html

Contact: {heger,holm}@embl-ebi.ac.uk






