Bioinformatics 20(Suppl. 1) © Oxford University Press 2004; all rights reserved.

Filtering erroneous protein annotation

D. Wieser, E. Kretschmann and R. Apweiler *

Sequence Database Group, European Bioinformatics Institute, Cambridge, CB10 1SD, UK

Received on January 15, 2004; accepted on March 1, 2004

Motivation: Automatically generated annotation for the protein data of UniProt (Universal Protein Resource) is planned to become publicly available on the UniProt web pages in April 2004. The output of an automated annotation pipeline is expected to enhance the data content of over 500 000 protein entries in the TrEMBL section. However, part of the automatically added data will be erroneous, as is part of the information coming from other sources. We present a post-processing system called Xanthippe that is based on a simple exclusion mechanism and a decision tree approach using the C4.5 data-mining algorithm.
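The abstract does not spell out how the exclusion mechanism works, so the following is only a minimal sketch of one plausible reading: a rule states that a given annotation should not co-occur with a given property of an entry, and any predicted annotation that violates a rule is flagged. The names `ExclusionRule` and `flag_annotations`, and the example property and keyword strings, are hypothetical and are not taken from Xanthippe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExclusionRule:
    """Hypothetical rule: `excluded` should not appear on entries having `condition`."""
    condition: str  # a property of the entry, e.g. a taxonomic group (illustrative)
    excluded: str   # an annotation assumed to contradict that property


def flag_annotations(properties, annotations, rules):
    """Return the annotations that are contradicted by at least one rule."""
    flagged = set()
    for rule in rules:
        if rule.condition in properties and rule.excluded in annotations:
            flagged.add(rule.excluded)
    return flagged


# Illustrative data only; not derived from UniProt.
rules = [ExclusionRule("taxon:Bacteria", "keyword:Nuclear protein")]
entry_properties = {"taxon:Bacteria", "signature:hypothetical_domain"}
predicted = {"keyword:Nuclear protein", "keyword:Hydrolase"}

flagged = flag_annotations(entry_properties, predicted, rules)
print(sorted(flagged))  # ['keyword:Nuclear protein']
```

In this sketch the flagged annotation is reported rather than silently deleted, which matches the abstract's wording that errors are detected and flagged.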

Results: Xanthippe detects and flags a large part of the annotation errors and considerably increases the reliability of both automatically generated data and annotation from other sources. A cross-validation against Swiss-Prot shows that errors in protein descriptions, comments and keywords are successfully filtered out. Xanthippe is a contradictive application, that is, it flags existing annotation as erroneous rather than proposing new annotation, and it can be combined seamlessly with predictive systems. It can be used either to improve the precision of automated annotation at a constant level of recall or to increase the recall at a constant level of precision.
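The precision and recall trade-off can be illustrated with a small worked example. This is a sketch on invented toy data, not a reproduction of the paper's evaluation: the function `precision_recall` and the sets below are assumptions made for illustration. Removing a flagged false positive from the predictions raises precision while leaving recall unchanged.

```python
def precision_recall(predicted, correct):
    """Compute (precision, recall) of a set of predicted annotations."""
    true_positives = len(predicted & correct)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(correct) if correct else 0.0
    return precision, recall


# Toy data: four correct annotations, five predictions (two of them wrong).
correct = {"a", "b", "c", "d"}
predicted = {"a", "b", "c", "x", "y"}

# Assume the filter flags one of the two wrong predictions.
flagged = {"x"}

before = precision_recall(predicted, correct)
after = precision_recall(predicted - flagged, correct)

print(before)  # (0.6, 0.75)
print(after)   # (0.75, 0.75)
```

The second mode of use follows from the same arithmetic: a more permissive predictor would recover more correct annotations (higher recall), and the filter would then remove enough of the extra false positives to keep precision at its previous level.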

Availability: The application of the Xanthippe rules can be browsed at http://www.ebi.uniprot.org/

Contact: apweiler{at}ebi.ac.uk

* To whom correspondence should be addressed.






