Bioinformatics Advance Access originally published online on March 7, 2007
Bioinformatics 2007 23(10):1274-1281; doi:10.1093/bioinformatics/btm087
A new method to measure the semantic similarity of GO terms
¹School of Computing, Clemson University, Clemson, SC 29634, USA, ²IBM T. J. Watson Research Center, 19 Skyline Drive, Hawthorne, NY 10532, USA and ³Department of Genetics and Biochemistry, Clemson University, Clemson, SC 29634, USA
*To whom correspondence should be addressed.
Abstract
Motivation: Although controlled biochemical or biological vocabularies, such as Gene Ontology (GO) (http://www.geneontology.org), address the need for consistent descriptions of genes in different data sources, there is still no effective method to determine the functional similarities of genes based on gene annotation information from heterogeneous data sources.
Results: To address this critical need, we proposed a novel method to encode a GO term's semantics (biological meaning) into a numeric value by aggregating the semantic contributions of its ancestor terms (including the term itself) in the GO graph and, in turn, designed an algorithm to measure the semantic similarity of GO terms. Based on the semantic similarities of the GO terms used for gene annotation, we designed a new algorithm to measure the functional similarity of genes. The results of using our algorithm to measure the functional similarities of genes in pathways retrieved from the Saccharomyces Genome Database (SGD), and the outcomes of clustering these genes based on the similarity values obtained by our algorithm, are shown to be consistent with human perspectives. Furthermore, we developed a set of online tools for gene similarity measurement and knowledge discovery.
Availability: The online tools are available at: http://bioinformatics.clemson.edu/G-SESAME
Contact: jzwang{at}cs.clemson.edu
Supplementary information: http://bioinformatics.clemson.edu/Publication/Supplement/gsp.htm
Associate Editor: Jonathan Wren
Received on October 6, 2006; revised on March 1, 2007; accepted on March 1, 2007
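The abstract describes the approach only in words, so the following is a minimal sketch of one way the aggregation could work, not the authors' implementation. It assumes that a term contributes 1.0 to its own semantics, that each ancestor's contribution is the best product of per-edge weights along any path from the term, that term similarity is the shared contributions divided by the total semantic values, and that gene similarity averages best-match term similarities. The edge weights (0.8 for is_a, 0.6 for part_of), the toy graph and every function name are assumptions made for illustration; none of them come from the abstract.

```python
# Hypothetical toy DAG (not real GO data): term -> list of (parent, relation).
TOY_DAG = {
    "root": [],
    "a": [("root", "is_a")],
    "b": [("root", "is_a")],
    "c": [("a", "is_a"), ("b", "part_of")],
    "d": [("a", "part_of")],
}

# Assumed per-relation weights; the abstract does not state any values.
EDGE_WEIGHT = {"is_a": 0.8, "part_of": 0.6}


def s_values(term, dag, weights=EDGE_WEIGHT):
    """Return {ancestor: contribution} for `term`, including the term itself.

    The term contributes 1.0; an ancestor contributes the largest product of
    edge weights over all paths leading from the term up to it.
    """
    contrib = {term: 1.0}
    stack = [term]
    while stack:
        child = stack.pop()
        for parent, relation in dag[child]:
            candidate = weights[relation] * contrib[child]
            # Keep only the strongest path; revisit the parent if it improved.
            if candidate > contrib.get(parent, 0.0):
                contrib[parent] = candidate
                stack.append(parent)
    return contrib


def semantic_value(term, dag):
    """Numeric encoding of a term: the sum of all ancestor contributions."""
    return sum(s_values(term, dag).values())


def term_similarity(t1, t2, dag):
    """Shared-ancestor contributions divided by the two semantic values."""
    s1, s2 = s_values(t1, dag), s_values(t2, dag)
    shared = s1.keys() & s2.keys()
    numerator = sum(s1[t] + s2[t] for t in shared)
    return numerator / (sum(s1.values()) + sum(s2.values()))


def gene_similarity(terms1, terms2, dag):
    """Average, over all annotating terms, of the best match in the other gene."""
    best1 = [max(term_similarity(a, b, dag) for b in terms2) for a in terms1]
    best2 = [max(term_similarity(b, a, dag) for a in terms1) for b in terms2]
    return (sum(best1) + sum(best2)) / (len(terms1) + len(terms2))


if __name__ == "__main__":
    print(semantic_value("c", TOY_DAG))
    print(term_similarity("c", "d", TOY_DAG))
    print(gene_similarity(["c", "d"], ["d"], TOY_DAG))
```

Under these assumptions a term is always fully similar to itself, and two terms that share only distant ancestors score low because the shared contributions have decayed along the path. Real use would replace the toy graph with parsed GO data.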

