Bioinformatics, Vol 14, 144-150, Copyright © 1998 by Oxford University Press


ARTICLES

DIVCLUS: an automatic method in the GEANFAMMER package that finds homologous domains in single- and multi-domain proteins

J Park and SA Teichmann
MRC Laboratory of Molecular Biology, Cambridge, UK.

MOTIVATION: Large-scale determination of relationships between the proteins produced by genome sequences is now common. All protein sequences are matched and those that have high match scores are clustered into families. In cases where the proteins are built of several domains or duplication modules, this can lead to misleading results. Consider the very simple example of three proteins: 1, formed by duplication modules A and B; 2, formed by duplication modules B' and C; and 3, formed by duplication modules C' and D. Duplication modules B and B' are homologous, as are C and C'. Matching the sequences of 1, 2 and 3 followed by simple single-linkage clustering would put all three in the same family, even though proteins 1 and 3 are not related. This happens because different parts of protein 2 match proteins 1 and 3. This paper describes a procedure, DIVCLUS, that divides such complex clusters of partially related sequences into simple clusters that contain only related duplication modules. In the example just given, it would produce two groups of sequences: the first with domain B of sequence 1 and domain B' of sequence 2, and the second with domain C of sequence 2 and domain C' of sequence 3. DIVCLUS is part of a package called GEANFAMMER, for GEnome ANalysis and protein FAMily MakER. The package automates the detection of families of duplication modules from a protein sequence database.

RESULTS: DIVCLUS has been applied to the division of single-linkage clusters generated from the protein sequences of six completely sequenced bacterial genomes. Out of 12 013 genes in these six genomes, 4563 single- and multi-domain sequences formed 1071 complex clusters. Application of the DIVCLUS program resolved these clusters into 2113 clusters corresponding to single duplication modules.

AVAILABILITY: The perl5 program and its documentation are available at the following address: http://www.mrc-lmb.cam.ac.uk/genomes/ and by anonymous ftp at ftp.mrc-lmb.cam.ac.uk in the directory /pub/genomes/Software/.

CONTACT: sat@mrc-lmb.cam.ac.uk; jong@mrc-lmb.cam.ac.uk
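
The three-protein example in the abstract can be illustrated with a minimal Python sketch. It is not the DIVCLUS algorithm itself, whose details are not given here; it only contrasts single-linkage clustering of whole proteins with the same clustering applied to matched regions. The `matches` list and the `single_linkage` helper are hypothetical names introduced for illustration, and the region boundaries are assumed to be known in advance.

```python
from collections import defaultdict

# Hypothetical pairwise matches between (protein, module) regions,
# taken from the three-protein example: B ~ B' and C ~ C'.
matches = [
    (("1", "B"), ("2", "B'")),
    (("2", "C"), ("3", "C'")),
]


def single_linkage(nodes, edges):
    """Return connected components of the match graph (union-find)."""
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent links to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    return sorted(sorted(g) for g in groups.values())


# Whole-protein view: any match between two proteins links them.
protein_edges = [(a[0], b[0]) for a, b in matches]
protein_clusters = single_linkage({"1", "2", "3"}, protein_edges)

# Region view: only the matched regions themselves are linked.
region_nodes = {region for pair in matches for region in pair}
region_clusters = single_linkage(region_nodes, matches)

print(protein_clusters)  # one cluster containing proteins 1, 2 and 3
print(region_clusters)   # two clusters: {1:B, 2:B'} and {2:C, 3:C'}
```

Clustering whole proteins merges 1 and 3 through protein 2, whereas clustering the matched regions keeps the B and C families apart, which is the kind of separation the abstract describes.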