Bioinformatics Advance Access originally published online on February 5, 2004
Bioinformatics 20(7) © Oxford University Press 2004; all rights reserved.
Statistically rigorous automated protein annotation
1 San Diego Supercomputer Center, San Diego, USA and 2 Department of Pharmacology, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0505, USA
Received on March 3, 2003; revised on November 1, 2003; accepted on November 13, 2003
Advance Access Publication February 5, 2004
Motivation: Assignment of putative protein functional annotation by comparative analysis using pre-defined experimental annotations is performed routinely by molecular biologists. The number and statistical significance of these assignments remain a challenge in this era of high-throughput proteomics. A combined statistical method that enables robust, automated protein annotation by reliably expanding existing annotation sets is described. An existing clustering scheme, based on relevant experimental information (e.g. sequence identity, keywords or gene expression data), is required. The method assigns new proteins to these clusters with a measure of reliability. It can also provide human reviewers with a reliability score for both new and previously classified proteins.
Results: A dataset of 27 000 annotated Protein Data Bank (PDB) polypeptide chains (of 36 000 chains currently in the PDB) was generated from 23 000 chains classified a priori.
Availability: PDB annotations and sample software implementation are freely accessible on the Web at http://pmr.sdsc.edu/go
Contact: bourne{at}sdsc.edu
* To whom correspondence should be addressed.