Bioinformatics Advance Access published online on May 27, 2004
Bioinformatics, doi:10.1093/bioinformatics/bth323
Bioinformatics © Oxford University Press 2004; all rights reserved
1 School of Engineering and Computer Science, Exeter University, Exeter EX4 4QF, UK
* To whom correspondence should be addressed. E-mail: Z.R.Yang{at}exeter.ac.uk.
Motivation: Clustering genes is a useful way to extract knowledge from DNA microarray gene expression data, and the extracted knowledge can ultimately be used to annotate the biological function of novel genes. How efficiently this knowledge is represented is therefore closely related to classification accuracy. However, this issue has not yet received the attention it deserves.
Results: A novel method, based on both template theory from cognitive psychology and pattern recognition, is developed in this study to represent the knowledge extracted from cluster analysis effectively. The basic principle is to represent the knowledge according to the relationship between genes and a discovered cluster structure. Based on this representation, a pattern recognition algorithm (the decision tree algorithm C4.5) is then used to construct a classifier for annotating the biological functions of novel genes. Experiments on five published data sets show that this method improves classification performance compared with the conventional method, and statistical tests indicate that the improvement is significant.
Availability: The software package can be obtained on request from the author.
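The abstract does not spell out the representation itself, so the following is only a minimal sketch of one way to read "the relationship between genes and a cluster structure". It assumes that each gene is re-described by its Euclidean distance to every cluster centroid, with plain k-means standing in for the unspecified cluster analysis. The function names `kmeans` and `template_features` and the toy expression profiles are hypothetical, and the final classification step with C4.5 is not shown.

```python
import math
import random


def kmeans(profiles, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns k centroids (assumed here to act as 'templates')."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(profiles, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in profiles:
            # Assign each gene to its nearest centroid.
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            groups[nearest].append(p)
        for j, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def template_features(profile, centroids):
    """Re-describe one gene by its distance to every cluster centroid (an assumption)."""
    return [math.dist(profile, c) for c in centroids]


# Toy expression profiles (made up for illustration): two low genes, two high genes.
profiles = [
    [0.1, 0.2, 0.1],
    [0.0, 0.3, 0.2],
    [2.0, 2.1, 1.9],
    [2.2, 1.8, 2.0],
]
centroids = kmeans(profiles, k=2)
features = [template_features(p, centroids) for p in profiles]
```

Under these assumptions, each row of `features` would replace the raw expression profile as the input to a decision tree learner such as C4.5, so the classifier sees a gene's position relative to the cluster structure and not its raw measurements.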
Revised November 6, 2003
Accepted November 13, 2003
Mining gene expression data based on template theory