
Bioinformatics, Vol 14, 174-187, Copyright © 1998 by Oxford University Press



Automated protein sequence database classification. II. Delineation of domain boundaries from sequence similarities

J Gracy and P Argos
European Molecular Biology Laboratory, Heidelberg, Germany.

MOTIVATION: Decomposing each protein into modular domains is a basic prerequisite for accurately classifying structural units in biological molecules. Boundaries between domains are indicated by two similar amino acid sequence segments located within the same protein (repeats) or within homologous proteins at notably different distances from their respective N- or C-termini.

RESULTS: We have developed an automated method that combines such positional constraints, derived from various detected pairwise sequence similarities, to delineate the modular organization of proteins. The procedure has been applied to a non-redundant data set of 26 990 proteins whose sequences were taken from the PIR and SWISS-PROT databanks and shared <60% sequence identity amongst pairs. The resultant clustering, delineation and multiple alignment of 24 380 sequence fragments yielded a new database of 4364 domain families. Comparison of the domain collection with that of PRODOM indicates a clear improvement in the number and size of domain families, domain boundaries and multiple sequence alignments. The accuracy and sensitivity of the method are illustrated by results obtained for ankyrin-like repeats and EGF-like modules.

AVAILABILITY: The resulting database, called DOMO, is available through the database search routine SRS at the Infobiogen (http://www.infobiogen.fr/srs5/), EBI (http://srs.ebi.ac.uk:5000/) and EMBL (http://www.embl-heidelberg.de/srs5/) World Wide Web sites.

CONTACT: gracy@infobiogen.fr
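The positional constraint described under MOTIVATION can be illustrated with a toy example. The sketch below is not the published DOMO procedure: the `LocalHit` record, the `putative_boundaries` function and the 30-residue `min_shift` threshold are all hypothetical names and values chosen for illustration. It only shows how one pairwise similarity, whose two segments lie at notably different distances from the N- or C-termini, might suggest a boundary in the protein with the longer overhang.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalHit:
    """One pairwise local similarity (hypothetical record).

    Coordinates are 1-based and inclusive. For an internal repeat,
    A and B are the same protein, so len_a == len_b.
    """
    len_a: int
    start_a: int
    end_a: int
    len_b: int
    start_b: int
    end_b: int


def putative_boundaries(hit, min_shift=30):
    """Suggest domain boundaries from one hit.

    min_shift is an assumed threshold (in residues) for what counts as
    a "notably different" distance from a terminus.
    Returns a list of (protein_label, residue_position) tuples.
    """
    suggestions = []

    # Residues preceding each matched segment (N-terminal overhang).
    n_a = hit.start_a - 1
    n_b = hit.start_b - 1
    # Residues following each matched segment (C-terminal overhang).
    c_a = hit.len_a - hit.end_a
    c_b = hit.len_b - hit.end_b

    # A much longer N-terminal overhang hints that extra sequence,
    # possibly another domain, precedes the segment in that protein.
    if abs(n_a - n_b) >= min_shift:
        suggestions.append(("A", hit.start_a) if n_a > n_b else ("B", hit.start_b))

    # The same reasoning applies at the C-terminal side.
    if abs(c_a - c_b) >= min_shift:
        suggestions.append(("A", hit.end_a) if c_a > c_b else ("B", hit.end_b))

    return suggestions


# Protein A (300 residues) matches over 150-250; protein B (120 residues)
# matches over 10-110. A carries long overhangs on both sides.
hit = LocalHit(len_a=300, start_a=150, end_a=250, len_b=120, start_b=10, end_b=110)
print(putative_boundaries(hit))
```

A full method would combine many such constraints across all detected similarities, as the abstract states; this sketch handles a single hit only.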




