Bioinformatics Vol. 19 no. 10 2003
Pages 1275-1283
© 2003 Oxford University Press
Investigating semantic similarity measures across the Gene Ontology: the relationship between sequence and annotation
Department of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL, UK
Received on July 2, 2002; revised on October 23, 2002 and December 23, 2002; accepted on December 6, 2002
Motivation: Many bioinformatics data resources hold data not only as sequences but also as annotation. In the majority of cases, annotation is written as scientific natural language: this is suitable for humans, but not particularly useful for machine processing. Ontologies offer a mechanism by which knowledge can be represented in a form capable of such processing. In this paper we investigate the use of ontological annotation to measure the similarity in knowledge content, or semantic similarity, between entries in a data resource. Such measures allow a bioinformatician to compare entries by their annotation in a manner analogous to comparing them by sequence. A measure of semantic similarity for the knowledge component of bioinformatics resources should afford a biologist a new tool in their repertoire of analyses.
Results: We present results from experiments that investigate the validity of semantic similarity by comparing it with sequence similarity. We also show a simple extension that enables a semantic search of the knowledge held within sequence databases.
Availability: Software available from http://www.russet.org.uk
Contact: p.lord{at}russet.org.uk
* To whom correspondence should be addressed.
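The abstract does not say how semantic similarity is computed, so the following is only an illustrative sketch of one common choice: an information-content measure in the style of Resnik, where two terms are as similar as their most informative shared ancestor. The ontology, the term names, and the annotation counts are all hypothetical toy values, not Gene Ontology data, and the function names are invented for this example.

```python
import math

# Hypothetical toy ontology: each term maps to its parent terms (is-a edges).
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "rna_binding": ["binding"],
    "kinase": ["catalysis"],
}

# Hypothetical counts of database entries annotated directly with each term.
DIRECT_COUNTS = {
    "root": 0,
    "binding": 2,
    "catalysis": 1,
    "dna_binding": 4,
    "rna_binding": 2,
    "kinase": 3,
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def term_probabilities():
    """Probability of each term, counting annotations to it or any descendant."""
    totals = dict.fromkeys(PARENTS, 0)
    for term, count in DIRECT_COUNTS.items():
        for anc in ancestors(term):
            totals[anc] += count
    grand_total = totals["root"]
    return {term: n / grand_total for term, n in totals.items()}


def resnik_similarity(a, b, probs):
    """Information content of the most informative ancestor shared by a and b."""
    shared = ancestors(a) & ancestors(b)
    return max(math.log(1.0 / probs[t]) for t in shared)


probs = term_probabilities()
print(resnik_similarity("dna_binding", "rna_binding", probs))
print(resnik_similarity("dna_binding", "kinase", probs))
```

In this toy data the two binding terms share the ancestor "binding", so they score higher than a pair whose only shared ancestor is the root, which carries no information. A real application would need to handle terms with no annotations, whose probability would be zero.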