Predicting the prognosis of breast cancer by integrating clinical and microarray data with Bayesian networks
1 Department of Electrical Engineering ESAT-SCD, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, 3001 Leuven, Belgium
2 Medical Direction, National Alliance of Christian Mutualities, Haachtsesteenweg 579, 1031 Brussel, Belgium
3 Department of Obstetrics and Gynecology, University Hospital Gasthuisberg, Katholieke Universiteit Leuven, Herestraat 49, 3000 Leuven, Belgium
*To whom correspondence should be addressed.
Motivation: Clinical data, such as patient history, laboratory analysis and ultrasound parameters, which are the basis of day-to-day clinical decision support, are often underused to guide the clinical management of cancer in the presence of microarray data. We propose a strategy based on Bayesian networks to treat clinical and microarray data on an equal footing. The main advantage of this probabilistic model is that it allows these data sources to be integrated in several ways and that it allows the model structure and parameters to be investigated and understood. Furthermore, using the concept of a Markov blanket, we can identify all the variables that shield the class variable from the influence of the remaining network. Therefore, Bayesian networks automatically perform feature selection by identifying the (in)dependency relationships with the class variable.
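To make the Markov blanket idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the network structure is already known and is given as a mapping from each node to its parents. The node names (`age`, `gene_A`, and so on) and the helper `markov_blanket` are hypothetical and chosen only for illustration. In a directed acyclic graph, the Markov blanket of a node consists of its parents, its children, and the other parents of its children.

```python
def markov_blanket(parents, target):
    """Return the Markov blanket of `target` in a DAG.

    `parents` maps every node to the set of its parent nodes.
    The blanket is: parents of target, children of target,
    and the other parents of those children.
    """
    # Children are the nodes that list the target among their parents.
    children = {node for node, ps in parents.items() if target in ps}

    # Co-parents ("spouses") share at least one child with the target.
    spouses = set()
    for child in children:
        spouses |= parents[child]

    blanket = set(parents.get(target, set())) | children | spouses
    blanket.discard(target)  # the target is not part of its own blanket
    return blanket


# Hypothetical toy structure mixing one clinical and several gene variables.
toy_parents = {
    "age": set(),
    "gene_A": set(),
    "gene_C": set(),
    "prognosis": {"age", "gene_A"},
    "gene_B": {"prognosis", "gene_C"},
    "gene_D": {"gene_C"},
}

selected = markov_blanket(toy_parents, "prognosis")
print(sorted(selected))
```

In this toy structure, `gene_D` is left out because it is connected to the class variable only through `gene_C`, which is how the blanket acts as a feature selector.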
Results: We evaluated three methods for integrating clinical and microarray data (decision integration, partial integration and full integration) and used them to classify publicly available data on breast cancer patients into a poor and a good prognosis group. The partial integration method is the most promising, with an area under the ROC curve of 0.845 on an independent test set. After choosing an operating point, its classification performance is better than that of frequently used indices.
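As an illustration of the two evaluation steps mentioned here, the sketch below computes the area under the ROC curve from predicted scores and then reports sensitivity and specificity at one chosen operating point. It is not the evaluation code used in the study. The labels, scores and threshold are made-up values, and the function names `roc_auc` and `sensitivity_specificity` are hypothetical. The AUC is computed as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Classify score >= threshold as positive and report both rates."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up example: 1 = poor prognosis, 0 = good prognosis.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]

auc = roc_auc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.45)
print(f"AUC={auc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

The AUC summarizes performance over all thresholds, whereas the operating point fixes a single threshold, which is what allows a direct comparison with indices that give a single classification per patient.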
Contact: olivier.gevaert{at}esat.kuleuven.be

