Bayesian neural network approaches to ovarian cancer identification from high-resolution mass spectrometry data
1School of Electronics Engineering and Computer Science, Peking University, China
2Information and Telecommunication Technology Center, University of Kansas, KS, USA
3Department of Electrical Engineering and Computer Science, University of Kansas, KS, USA
*To whom correspondence should be addressed at 2001 Eaton Hall, 1520 West 15th Street, Lawrence, KS 66045, USA.
Motivation: Classifying high-dimensional data is a persistent challenge in statistical machine learning. We propose a novel method, named shallow feature selection, that assigns each feature a probability of being selected based on the structure of the training data itself. Because the method is independent of any particular classifier, the high dimensionality of biological data can be quickly reduced to a size manageable for subsequent processing. Moreover, to improve both the efficiency and the performance of classification, these prior probabilities are further used to specify the distributions of top-level hyperparameters in hierarchical Bayesian neural network (BNN) models, as well as the parameters in Gaussian process models.
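The abstract does not spell out how the selection probabilities are computed, so the following is only an illustrative sketch of the general idea of giving each feature a classifier-independent selection probability. It assumes, purely for illustration, that the score is the absolute Welch t-statistic between the two classes, normalized to sum to one; the function names `selection_probabilities` and `sample_features` are hypothetical and are not taken from the paper.

```python
import math
import random


def selection_probabilities(X, y):
    """Return one selection probability per feature.

    ASSUMPTION: the score is |Welch t-statistic| between class 1 and class 0,
    normalized to sum to 1. This is a stand-in, not the paper's definition.
    """
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        a = [row[j] for row, label in zip(X, y) if label == 1]
        b = [row[j] for row, label in zip(X, y) if label == 0]
        mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
        var_a = sum((v - mean_a) ** 2 for v in a) / (len(a) - 1)
        var_b = sum((v - mean_b) ** 2 for v in b) / (len(b) - 1)
        se = math.sqrt(var_a / len(a) + var_b / len(b))
        scores.append(abs(mean_a - mean_b) / se if se > 0 else 0.0)
    total = sum(scores)
    if total == 0:
        # No feature separates the classes: fall back to a uniform prior.
        return [1.0 / n_features] * n_features
    return [s / total for s in scores]


def sample_features(probs, k, rng):
    """Draw up to k distinct feature indices, weighted by probs."""
    remaining = list(range(len(probs)))
    weights = list(probs)
    chosen = []
    for _ in range(min(k, len(remaining))):
        if sum(weights) <= 0:
            break  # only zero-probability features are left
        pos = rng.choices(range(len(remaining)), weights=weights)[0]
        chosen.append(remaining.pop(pos))
        weights.pop(pos)
    return chosen


# Toy data: feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 5.1], [0.9, 4.9],
     [3.0, 5.0], [3.1, 5.1], [2.9, 4.9]]
y = [0, 0, 0, 1, 1, 1]
probs = selection_probabilities(X, y)
picked = sample_features(probs, 1, random.Random(0))
```

Because the probabilities depend only on the training data, a reduced feature set drawn this way could in principle be passed to any downstream classifier.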
Results: Three BNN approaches were derived and applied to identify ovarian cancer from the NCI's high-resolution mass spectrometry data, yielding excellent performance in 1000 independent k-fold cross validations (k = 2,...,10). For instance, average sensitivity and specificity of 98.56% and 98.42%, respectively, were achieved in the 2-fold cross validations. Furthermore, only one control sample and one cancer sample were misclassified in the leave-one-out cross validation. Several other popular classifiers were also tested for comparison.
Availability: The programs were implemented in MATLAB, R and Neal's fbm.2004-11-10.
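As a rough illustration of the evaluation protocol described here, the sketch below averages sensitivity, TP/(TP+FN), and specificity, TN/(TN+FP), over repeated random k-fold cross validations. It is not the authors' code: the nearest-centroid classifier is a simple stand-in for the BNN models, predictions are pooled across the k folds of each repetition before the two indices are computed (an assumption about how the averages were formed), and all function names are hypothetical.

```python
import random


def kfold_indices(n, k, rng):
    """Shuffle 0..n-1 and split the indices into k roughly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP/(TP+FN), specificity = TN/(TN+FP); label 1 = cancer."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec


def nearest_centroid(X_train, y_train, X_test):
    """Stand-in classifier (NOT a BNN): assign each test row to the nearer class mean."""
    def centroid(label):
        rows = [x for x, l in zip(X_train, y_train) if l == label]
        return [sum(col) / len(rows) for col in zip(*rows)]

    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    c0, c1 = centroid(0), centroid(1)
    return [1 if dist(x, c1) < dist(x, c0) else 0 for x in X_test]


def repeated_cv(X, y, fit_predict, k, repeats, seed=0):
    """Average sensitivity and specificity over `repeats` random k-fold splits."""
    rng = random.Random(seed)
    sens_sum = spec_sum = 0.0
    for _ in range(repeats):
        y_pred = [None] * len(y)
        for fold in kfold_indices(len(y), k, rng):
            held_out = set(fold)
            train = [i for i in range(len(y)) if i not in held_out]
            preds = fit_predict([X[i] for i in train],
                                [y[i] for i in train],
                                [X[i] for i in fold])
            for i, p in zip(fold, preds):
                y_pred[i] = p
        sens, spec = sensitivity_specificity(y, y_pred)
        sens_sum += sens
        spec_sum += spec
    return sens_sum / repeats, spec_sum / repeats


# Toy, well-separated one-feature data: 10 controls (0) and 10 cancers (1).
X = [[0.1 * i] for i in range(10)] + [[10.0 + 0.1 * i] for i in range(10)]
y = [0] * 10 + [1] * 10
avg_sens, avg_spec = repeated_cv(X, y, nearest_centroid, k=2, repeats=5)
```

Leave-one-out cross validation corresponds to setting `k` equal to the number of samples, in which case a single repetition suffices because the split is deterministic up to ordering.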
Contact: xwchen{at}ku.edu
Received on January 15, 2005; accepted on March 27, 2005