Bioinformatics Advance Access published online on December 12, 2006

Bioinformatics, doi:10.1093/bioinformatics/btl630
All versions of this article: 23/4/458 (most recent), btl630v1

© The Author (2006). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org
Received September 20, 2006
Revised November 30, 2006
Accepted December 6, 2006

Article

Segmentation and intensity estimation of microarray images using a gamma-t mixture model

Jangsun Baek 1 *, Young Sook Son 1, and Geoffrey J. McLachlan 2

1 Department of Statistics, Chonnam National University, Gwangju 500-757, South Korea
2 Department of Mathematics and Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD 4072, Australia

* To whom correspondence should be addressed.
Jangsun Baek, E-mail: jbaek{at}chonnam.ac.kr


   Abstract

Motivation: We present a new approach to the analysis of images from complementary DNA (cDNA) microarray experiments. Image segmentation and intensity estimation are performed simultaneously by adopting a two-component mixture model. One component of this mixture corresponds to the distribution of the background intensity, while the other corresponds to the distribution of the foreground intensity. The intensity measurement is a bivariate vector consisting of the red (R) and green (G) intensities. The background intensity component is modeled by a bivariate gamma distribution whose marginal densities for the red and green intensities are independent three-parameter gamma distributions with different parameters. The foreground intensity component is taken to be a bivariate t distribution, with the constraint that the mean of the foreground is greater than that of the background for each of the two colors. The degrees of freedom of this t distribution are inferred from the data, but they could be specified in advance to reduce the computation time. The covariance matrix is not restricted to be diagonal, so the model allows for nonzero correlation between the R and G foreground intensities. This gamma-t mixture model is fitted by maximum likelihood via the EM algorithm. In a final step, nonparametric (kernel) smoothing is applied to the posterior probabilities of component membership.
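For orientation, the mixture described above can be written out as follows. This is a sketch only: the symbols (mixing proportions \pi_B and \pi_F, gamma shape \alpha_c, scale \beta_c and location \gamma_c, and t parameters \mu, \Sigma, \nu) are notation chosen here for illustration and are not taken from the abstract. The last line is the posterior probability of foreground membership, the quantity that is subsequently smoothed.

```latex
% y = (y_R, y_G)^T is the bivariate intensity of one pixel.
\begin{align*}
f(\mathbf{y}) &= \pi_B\, f_B(\mathbf{y}) + \pi_F\, f_F(\mathbf{y}),
  \qquad \pi_B + \pi_F = 1, \\
f_B(\mathbf{y}) &= \prod_{c \in \{R,G\}}
  \frac{(y_c - \gamma_c)^{\alpha_c - 1}}{\beta_c^{\alpha_c}\,\Gamma(\alpha_c)}
  \exp\!\left\{-\frac{y_c - \gamma_c}{\beta_c}\right\},
  \qquad y_c > \gamma_c, \\
f_F(\mathbf{y}) &= \frac{\Gamma\!\left(\frac{\nu + 2}{2}\right)}
  {\Gamma\!\left(\frac{\nu}{2}\right)\,\nu\pi\,|\boldsymbol{\Sigma}|^{1/2}}
  \left[1 + \frac{(\mathbf{y} - \boldsymbol{\mu})^{T}
  \boldsymbol{\Sigma}^{-1}(\mathbf{y} - \boldsymbol{\mu})}{\nu}\right]^{-(\nu + 2)/2}, \\
\tau_F(\mathbf{y}) &= \frac{\pi_F\, f_F(\mathbf{y})}{f(\mathbf{y})}.
\end{align*}
```

In this notation the mean of each background marginal is \gamma_c + \alpha_c \beta_c, so the constraint stated above reads \mu_c > \gamma_c + \alpha_c \beta_c for c = R, G (with \nu > 1, so that the mean of the t component exists).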

The main advantages of this approach are: (1) it enjoys the well-known strengths of a mixture model, namely flexibility and adaptability to the data; (2) it considers segmentation and intensity estimation simultaneously and not separately as in commonly used existing software, and it also works with the red and green intensities in a bivariate framework as opposed to their separate estimation via univariate methods; (3) the use of the three-parameter gamma distribution for the background red and green intensities provides a much better fit than the normal (lognormal) or t distributions; (4) the use of the bivariate t distribution for the foreground intensity provides a model that is less sensitive to extreme observations; (5) as a consequence of the aforementioned properties, it allows segmentation to be undertaken for a wide range of spot shapes, including doughnut-shaped and sickle-shaped spots, and in the presence of artifacts.
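To make the classification step concrete, the following Python sketch evaluates the posterior probability that a pixel belongs to the foreground under a gamma-t mixture with given parameters, and then smooths those probabilities over the pixel grid. It is not the authors' Matlab implementation. The function names, the parameter values, and the choice of a Gaussian kernel in Nadaraya-Watson form are all assumptions made for illustration; the abstract does not specify the kernel, and the EM updates that would estimate the parameters are omitted.

```python
import math

def gamma3_logpdf(x, shape, scale, loc):
    """Log density of a three-parameter (shifted) gamma; -inf at or below loc."""
    z = x - loc
    if z <= 0:
        return float("-inf")
    return ((shape - 1) * math.log(z) - z / scale
            - math.lgamma(shape) - shape * math.log(scale))

def bivariate_t_logpdf(y, mu, sigma, nu):
    """Log density of a bivariate t with location mu, 2x2 scale matrix sigma, nu d.o.f."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    dr, dg = y[0] - mu[0], y[1] - mu[1]
    # Squared Mahalanobis distance via the closed-form inverse of a 2x2 matrix.
    maha = (d * dr * dr - (b + c) * dr * dg + a * dg * dg) / det
    p = 2
    return (math.lgamma((nu + p) / 2) - math.lgamma(nu / 2)
            - (p / 2) * math.log(nu * math.pi) - 0.5 * math.log(det)
            - ((nu + p) / 2) * math.log1p(maha / nu))

def foreground_posterior(y, pi_fg, bg, fg):
    """Posterior probability that pixel y = (R, G) comes from the t (foreground) component."""
    log_bg = math.log(1 - pi_fg) + sum(
        gamma3_logpdf(v, *params) for v, params in zip(y, bg))
    log_fg = math.log(pi_fg) + bivariate_t_logpdf(y, fg["mu"], fg["sigma"], fg["nu"])
    m = max(log_bg, log_fg)  # subtract the max for numerical stability
    w_bg, w_fg = math.exp(log_bg - m), math.exp(log_fg - m)
    return w_fg / (w_bg + w_fg)

def smooth_posteriors(post, bandwidth=1.0, radius=2):
    """Gaussian-kernel weighted average of posteriors over neighbouring pixels (assumed kernel)."""
    rows, cols = len(post), len(post[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            num = den = 0.0
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    r, c = i + di, j + dj
                    if 0 <= r < rows and 0 <= c < cols:
                        w = math.exp(-(di * di + dj * dj) / (2 * bandwidth ** 2))
                        num += w * post[r][c]
                        den += w
            out[i][j] = num / den
    return out

# Hypothetical parameter values, for illustration only (not estimates from the paper).
BG = [(2.0, 50.0, 100.0), (2.0, 40.0, 90.0)]  # (shape, scale, location) for R and G
FG = {"mu": (2000.0, 1800.0),
      "sigma": ((250000.0, 150000.0), (150000.0, 250000.0)),
      "nu": 4.0}
PI_FG = 0.3

if __name__ == "__main__":
    # A tiny 2x2 patch of (R, G) intensities: dim pixels on the left, bright on the right.
    pixels = [[(180.0, 160.0), (2100.0, 1900.0)],
              [(170.0, 150.0), (1950.0, 1700.0)]]
    post = [[foreground_posterior(y, PI_FG, BG, FG) for y in row] for row in pixels]
    smoothed = smooth_posteriors(post, bandwidth=0.5, radius=1)
    mask = [[p > 0.5 for p in row] for row in smoothed]  # True marks foreground
    print(mask)
```

In a full fit, these posterior probabilities would be the E-step quantities of the EM algorithm, recomputed after each parameter update; thresholding the smoothed values at 0.5 is one simple way to obtain a segmentation mask.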

Results: We apply our method for gridding, segmentation, and estimation to real cDNA microarray images and to artificial data. Our method gives better spot-shape segmentation and better intensity estimation than the R packages Spot and spotSegmentation. On the real data it detected blank spots as well as bright artifacts, and on the synthetic data it estimated spot intensities with high accuracy.

Availability: The algorithms were implemented in Matlab. The Matlab code implementing both the gridding and the segmentation/estimation is available upon request.


Associate Editor: Satoru Miyano


This article has been cited by other articles:


A. Daskalakis, D. Cavouras, P. Bougioukos, S. Kostopoulos, D. Glotsos, I. Kalatzis, G. C. Kagadis, C. Argyropoulos, and G. Nikiforidis
Improving gene quantification by adjustable spot-image restoration
Bioinformatics, September 1, 2007; 23(17): 2265 - 2272.


