Skip Navigation



Bioinformatics Advance Access published online on November 16, 2006

Bioinformatics, doi:10.1093/bioinformatics/btl571
This Article
Right arrow Advance Access manuscript (PDF) Freely available
Right arrowOA All Versions of this Article:
23/2/232    most recent
btl571v1
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Google Scholar
Right arrow Articles by Tian, Y.
Right arrow Articles by Patel, J. M.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Tian, Y.
Right arrow Articles by Patel, J. M.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

© 2006 The Author(s)
Received August 22, 2006
Revised November 7, 2006
Accepted November 8, 2006

Article

SAGA: a subgraph matching tool for biological graphs

Yuanyuan Tian 1, Richard C. McEachin 2, Carlos Santos 3, David J. States 3, and Jignesh M. Patel 1 *

1 Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA
2 National Center for Integrative Biomedical Informatics, University of Michigan, Ann Arbor, MI 48109, USA
3 Department of Human Genetics and Bioinformatics Program, University of Michigan, Ann Arbor, MI 48109, USA

* To whom correspondence should be addressed.
Jignesh M. Patel, E-mail: jignesh{at}eecs.umich.edu


   Abstract

Motivation: With the rapid increase in the availability of biological graph datasets, there is a growing need for effective and efficient graph querying methods. Due to the noisy and incomplete characteristics of these datasets, exact graph matching methods have limited use and approximate graph matching methods are required. Unfortunately, existing graph matching methods are too restrictive as they only allow exact or near exact graph matching. This paper presents a novel approximate graph matching technique called SAGA. This technique employs a flexible model for computing graph similarity, which allows for node gaps, node mismatches, and graph structural differences. SAGA employs an indexing technique that allows it to efficiently evaluate queries even against large graph datasets.

Results: SAGA has been used to query biological pathways and literature datasets, which has revealed interesting similarities between distinct pathways that cannot be found by existing methods. These matches associate seemingly unrelated biological processes, connect studies in different sub-areas of biomedical research, and thus pose hypotheses for new discoveries. SAGA is also orders of magnitude faster than existing methods.

Availability: SAGA can be accessed freely via the web at http://www.eecs.umich.edu/saga. Binaries are also freely available at this website.


Associate Editor: Martin Bishop
Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?




Disclaimer:
Please note that abstracts for content published before 1996 were created through digital scanning and may therefore not exactly replicate the text of the original print issues. All efforts have been made to ensure accuracy, but the Publisher will not be held responsible for any remaining inaccuracies. If you require any further clarification, please contact our Customer Services Department.