Bioinformatics Advance Access published online on July 26, 2006
Bioinformatics, doi:10.1093/bioinformatics/btl393
1 Division of Biostatistics, School of Public Health, University of Minnesota
* To whom correspondence should be addressed.
Motivation: It is of biological interest to determine, based on global expression profiling, whether human blood outgrowth endothelial cells (BOECs) belong to, or are closer to, large vessel endothelial cells (LVECs) or microvascular endothelial cells (MVECs). An earlier analysis using hierarchical clustering and a small set of genes suggested that BOECs were closer to MVECs. We take a semi-supervised learning approach that exploits the two known classes, LVEC and MVEC, while allowing BOEC samples either to belong to one of the two classes or to form a new class of their own. For high-dimensional data such as those encountered here, we propose a penalized mixture model with a weighted L1 penalty, which performs automatic feature selection while the model is being fitted.
Results: We applied our penalized mixture model to a combined dataset containing 27 BOEC, 28 LVEC and 25 MVEC samples. The analysis indicated that the BOEC samples appeared to form their own new class. A simulation study confirmed that, compared with the standard mixture model with or without initial variable selection, the penalized mixture model performed much better in identifying relevant genes and forming the corresponding clusters. The penalized mixture model appears promising for high-dimensional data, offering both novel class discovery and automatic feature selection.
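The idea of a penalized mixture model with semi-supervised labels can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes Gaussian components with a common diagonal covariance, features standardized to zero mean (so that a component mean shrunk to zero marks an uninformative gene), and an EM algorithm in which the L1 penalty turns the mean update into soft-thresholding. The function name `penalized_mixture_em`, the `weights` array and the convention that `-1` marks a sample of unknown class are hypothetical choices made for illustration only.

```python
import numpy as np


def soft_threshold(z, t):
    """Shrink z toward zero by t; values with |z| <= t become exactly zero."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def penalized_mixture_em(X, labels, n_components, lam,
                         weights=None, n_iter=100, seed=0):
    """Illustrative EM for an L1-penalized Gaussian mixture.

    X        : (n, p) array of samples by features.
    labels   : (n,) int array; 0..K-1 for known classes, -1 if unknown.
    lam      : penalty strength (assumed tuning parameter).
    weights  : (K, p) per-mean penalty weights (assumed; defaults to ones).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    K = n_components
    if weights is None:
        weights = np.ones((K, p))
    known = labels >= 0
    one_hot_known = np.eye(K)[labels[known]]

    # Random soft assignments for unlabelled samples; labelled ones are fixed.
    tau = rng.dirichlet(np.ones(K), size=n)
    tau[known] = one_hot_known
    var = X.var(axis=0) + 1e-8

    for _ in range(n_iter):
        # M-step: mixing proportions and soft-thresholded component means.
        nk = tau.sum(axis=0) + 1e-12
        pi = nk / n
        mu_tilde = (tau.T @ X) / nk[:, None]
        mu = soft_threshold(mu_tilde, lam * weights * var[None, :] / nk[:, None])

        # M-step: common diagonal variance, pooled over components.
        resid2 = (X[:, None, :] - mu[None, :, :]) ** 2      # shape (n, K, p)
        var = np.einsum("ik,ikj->j", tau, resid2) / n + 1e-8

        # E-step: posterior class probabilities, computed in log space.
        log_dens = -0.5 * (resid2 / var + np.log(2 * np.pi * var)).sum(axis=2)
        log_post = np.log(pi) + log_dens
        log_post -= log_post.max(axis=1, keepdims=True)
        tau = np.exp(log_post)
        tau /= tau.sum(axis=1, keepdims=True)
        tau[known] = one_hot_known                           # keep known labels

    return pi, mu, var, tau
```

In a setting like the one described, LVEC and MVEC samples would carry labels 0 and 1, BOEC samples would carry -1, and `n_components=3` would leave room for a new class. Features whose thresholded means are zero in every component play no role in separating the clusters, which is how the penalty acts as a feature selector in this sketch.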
Received April 21, 2006
Revised July 10, 2006
Accepted July 11, 2006
Article
Semi-supervised learning via penalized mixture model with application to microarray sample classification
Wei Pan 1 *, Xiaotong Shen 2, Aixiang Jiang 3, and Robert P. Hebbel 4
2 School of Statistics, University of Minnesota
3 Department of Biostatistics, Vanderbilt University
4 Vascular Biology Center and Division of Hematology-Oncology-Transplantation, University of Minnesota Medical School
Wei Pan, E-mail: weip{at}biostat.umn.edu
Abstract
Associate Editor: John Quackenbush

