Bioinformatics Advance Access published online on May 27, 2004

Bioinformatics, doi:10.1093/bioinformatics/bth323
Bioinformatics © Oxford University Press 2004; all rights reserved
All versions of this article: 20/16/2759 (most recent); bth323v1

Received August 21, 2003
Revised November 6, 2003
Accepted November 13, 2003

Article

Mining gene expression data based on template theory

Zheng Rong Yang 1*

1 School of Engineering and Computer Science, Exeter University, Exeter EX4 4QF, UK

* To whom correspondence should be addressed. E-mail: Z.R.Yang{at}exeter.ac.uk.


   Abstract

Motivation: Clustering genes is a useful way to extract knowledge from DNA microarray gene expression data, and the extracted knowledge can ultimately be used to annotate the biological functions of novel genes. How efficiently this knowledge is represented therefore bears directly on classification accuracy. However, this issue has not yet received the attention it deserves.

Results: This study develops a novel method, based on both template theory in cognitive psychology and pattern recognition, for effectively representing the knowledge extracted by cluster analysis. The basic principle is to represent the knowledge in terms of the relationship between each gene and the cluster structure that has been found. Using this representation, a pattern recognition algorithm (the decision tree algorithm C4.5) is then used to construct a classifier that annotates the biological functions of novel genes. Experiments on five published data sets show that this method improves classification performance over the conventional method, and statistical tests indicate that the improvement is significant.

Availability: The software package is available from the author on request.
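
As a rough illustration of the representation principle described in the abstract, the sketch below describes each gene by its relationship to a set of cluster templates. The abstract does not say how that relationship is measured, so several things here are assumptions rather than the author's method: each template is taken to be the mean expression profile of a cluster, the relationship is taken to be Euclidean distance, and the function names (`centroid`, `build_templates`, `template_features`) and the toy data are hypothetical.

```python
import math


def centroid(rows):
    """Mean expression profile of a cluster (the assumed 'template')."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def build_templates(profiles, cluster_labels):
    """Group profiles by cluster label and return one template per cluster."""
    clusters = {}
    for profile, label in zip(profiles, cluster_labels):
        clusters.setdefault(label, []).append(profile)
    return {label: centroid(rows) for label, rows in sorted(clusters.items())}


def template_features(profile, templates):
    """Represent a gene by its Euclidean distance to every cluster template."""
    return [math.dist(profile, t) for t in templates.values()]


# Toy expression profiles (rows = genes, columns = conditions).
# Illustrative values only, not data from the paper.
profiles = [[1.0, 1.0], [1.0, 3.0], [5.0, 5.0], [7.0, 5.0]]
cluster_labels = [0, 0, 1, 1]  # assumed output of a prior cluster analysis

templates = build_templates(profiles, cluster_labels)
features = [template_features(p, templates) for p in profiles]
```

Under these assumptions, each row of `features` would replace the raw expression profile as the input to a decision tree learner such as C4.5, together with the known functional class of each training gene.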

