Bioinformatics Vol. 19 no. 10 2003
Pages 1275-1283
© 2003 Oxford University Press
Investigating semantic similarity measures across the Gene Ontology: the relationship between sequence and annotation
Department of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL, UK
Received on July 2, 2002; revised on October 23, 2002 and December 23, 2002; accepted on December 6, 2002
Motivation: Many bioinformatics data resources hold data not only in the form of sequences, but also as annotation. In the majority of cases, annotation is written as scientific natural language: this is suitable for humans, but not particularly useful for machine processing. Ontologies offer a mechanism by which knowledge can be represented in a form capable of such processing. In this paper we investigate the use of ontological annotation to measure the similarity in knowledge content, or semantic similarity, between entries in a data resource. Such measures allow a bioinformatician to perform a similarity search over annotation in a manner analogous to those performed over sequences. A measure of semantic similarity for the knowledge component of bioinformatics resources should afford a biologist a new tool in their repertoire of analyses.
Results: We present the results from experiments that investigate the validity of using semantic similarity by comparison with sequence similarity. We show a simple extension that enables a semantic search of the knowledge held within sequence databases.
Availability: Software available from http://www.russet.org.uk
Contact: p.lord{at}russet.org.uk
* To whom correspondence should be addressed.
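As an illustration of what a semantic similarity measure over ontological annotation can look like, the sketch below scores two ontology terms by the information content of their most informative shared ancestor (Resnik's measure). The abstract does not say which measure is used, so the choice of measure is an assumption made for illustration, and the toy ontology, term names, and annotation counts are all hypothetical.

```python
import math

# Hypothetical toy ontology: each term maps to its parent terms (a DAG).
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "rna_binding": ["binding"],
}

# Hypothetical numbers of database entries annotated directly with each term.
DIRECT_COUNTS = {
    "root": 0,
    "binding": 2,
    "catalysis": 4,
    "dna_binding": 1,
    "rna_binding": 1,
}


def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def term_probabilities():
    """Probability of each term, counting annotations to its descendants too."""
    total = sum(DIRECT_COUNTS.values())
    usage = {term: 0 for term in PARENTS}
    for term, count in DIRECT_COUNTS.items():
        # An annotation to a term also counts as a use of every ancestor.
        for anc in ancestors(term):
            usage[anc] += count
    return {term: usage[term] / total for term in PARENTS}


def resnik_similarity(a, b, probs):
    """Information content (-log p) of the most informative shared ancestor."""
    shared = ancestors(a) & ancestors(b)
    return max(-math.log(probs[term]) for term in shared)


if __name__ == "__main__":
    probs = term_probabilities()
    print(resnik_similarity("dna_binding", "rna_binding", probs))
    print(resnik_similarity("dna_binding", "catalysis", probs))
```

In this toy data the two binding terms share the ancestor "binding", which is used by half of the annotations, so they score higher than a pair whose only shared ancestor is the root. A score of this kind could then be compared against a sequence similarity score for the same pair of entries.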