
Bioinformatics Vol. 18 no. 12 2002
Pages 1641-1649
© 2002 Oxford University Press

Modeling the percolation of annotation errors in a database of protein sequences

Walter R. Gilks 1,*,†, Benjamin Audit 2,†, Daniela De Angelis 1,3, Sophia Tsoka 2 and Christos A. Ouzounis 2

1 Medical Research Council Biostatistics Unit, Cambridge
2 Computational Genomics Group, The European Bioinformatics Institute, EMBL Cambridge Outstation, Cambridge, CB10 1SD, UK
3 Statistics Unit, Public Health Laboratory Service, London, UK

Received on April 5, 2002; revised on May 30, 2002; accepted on June 6, 2002

Public sequence databases contain information on the sequence, structure and function of proteins. Genome sequencing projects have led to a rapid increase in protein sequence information, but reliable, experimentally verified information on protein function lags a long way behind. To address this deficit, functional annotation in protein databases is often inferred by sequence similarity to homologous, annotated proteins, with the attendant possibility of error. However, the functional annotation of these homologous proteins may itself have been acquired through sequence similarity to yet other proteins, and it is generally not possible to determine how the functional annotation of any given protein has been acquired. Thus the possibility of chains of misannotation arises, a process we term ‘error percolation’. With some simple assumptions, we develop a dynamical probabilistic model for these misannotation chains. Exploring the consequences of the model for annotation quality shows that this iterative approach leads to a systematic deterioration of database quality.
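The percolation mechanism described above can be illustrated with a small simulation. This is only a toy sketch under assumptions that are not taken from the article: the function name `simulate_percolation`, its parameters, the uniform choice of source entry, and the single per-transfer error probability are all hypothetical, and this is not the authors' dynamical probabilistic model. Each new entry copies its annotation from a randomly chosen existing entry, so it is correct only if its source was correct and the transfer itself introduced no error.

```python
import random


def simulate_percolation(n_verified=100, n_inferred=5000,
                         p_transfer_error=0.05, seed=0):
    """Toy model of annotation by similarity (illustrative assumptions only).

    Starts with n_verified entries assumed to be correctly annotated, then
    adds n_inferred entries one at a time. Returns the fraction of correct
    entries in the database after each addition.
    """
    rng = random.Random(seed)
    correct = [True] * n_verified   # assumed experimentally verified seeds
    n_correct = n_verified
    fractions = []
    for _ in range(n_inferred):
        # Assumption: the source is drawn uniformly from all existing
        # entries, verified or inferred, since provenance is unknown.
        source_ok = rng.choice(correct)
        # Assumption: each transfer goes wrong with a fixed probability.
        transfer_ok = rng.random() >= p_transfer_error
        new_ok = source_ok and transfer_ok
        correct.append(new_ok)
        n_correct += new_ok
        fractions.append(n_correct / len(correct))
    return fractions


if __name__ == "__main__":
    history = simulate_percolation()
    print(f"fraction correct after {len(history)} inferred entries: "
          f"{history[-1]:.3f}")
```

Because an incorrect entry can itself serve as a source, errors in this sketch are inherited along chains of transfers rather than arising only once per entry, which is the qualitative effect the abstract calls error percolation.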

Contact: WRG: wally.gilks@mrc-bsu.cam.ac.uk; BA and CAO: audit@ebi.ac.uk; ouzounis@ebi.ac.uk

* To whom correspondence should be addressed.

† Both these authors contributed equally to this work.






