Bioinformatics Vol. 18 no. 4 2002
Pages 555-565
© 2002 Oxford University Press

Binary analysis and optimization-based normalization of gene expression data

Ilya Shmulevich and Wei Zhang

Cancer Genomics Laboratory, Department of Pathology, University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd, Box 85, Houston, TX 77030, USA

Received on August 3, 2001; revised on October 11, 2001; accepted on November 23, 2001

Motivation: Most approaches to gene expression analysis use real-valued expression data, produced by high-throughput screening technologies, such as microarrays. Often, some measure of similarity must be computed in order to extract meaningful information from the observed data. The choice of this similarity measure frequently has a profound effect on the results of the analysis, yet no standards exist to guide the researcher.

Results: To address this issue, we propose to analyse gene expression data entirely in the binary domain. The natural measure of similarity then becomes the Hamming distance, which reflects the notion of similarity used by biologists. We also develop a novel data-dependent, optimization-based method, built on Genetic Algorithms (GAs), for normalizing gene expression data. Normalization is a necessary step before quantizing gene expression data into the binary domain and, more generally, for comparing data between different arrays. We then present an algorithm for binarizing gene expression data and illustrate the use of the above methods on two different sets of data. Using Multidimensional Scaling, we show that a reasonable degree of separation between different tumor types in each data set can be achieved by working solely in the binary domain. The binary approach offers several advantages, such as noise resilience and computational efficiency, making it a viable approach to extracting meaningful biological information from gene expression data.
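
As an illustration of how similarity can be computed in the binary domain, the following is a minimal Python sketch, not the algorithm of the paper. It assumes the expression values are already normalized, and it binarizes each gene at its median across samples, which is a stand-in chosen only for illustration; the abstract does not specify the binarization algorithm. The function names `binarize` and `hamming` and the sample values are hypothetical.

```python
from statistics import median


def binarize(profiles):
    """Binarize each gene (column) across samples (rows) at its median.

    NOTE: the median threshold is an illustrative assumption, not the
    binarization algorithm described in the paper.
    """
    n_genes = len(profiles[0])
    thresholds = [median(row[g] for row in profiles) for g in range(n_genes)]
    return [
        [1 if row[g] > thresholds[g] else 0 for g in range(n_genes)]
        for row in profiles
    ]


def hamming(a, b):
    """Count the positions at which two equal-length binary vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have equal length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical, already-normalized values: rows are samples, columns are genes.
samples = [
    [2.1, 0.3, 1.8, 0.2],
    [1.9, 0.4, 2.2, 0.1],
    [0.2, 1.7, 0.3, 2.0],
    [0.4, 2.1, 0.1, 1.6],
]

binary = binarize(samples)

# Pairwise Hamming distances between the binarized samples.
distances = [[hamming(a, b) for b in binary] for a in binary]
```

In this toy input the first two samples share one binary profile and the last two share its complement, so the distance matrix separates them into two groups. A matrix of this kind is the sort of input that a Multidimensional Scaling routine would take to produce a low-dimensional view of the samples.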

Contact: is{at}ieee.org




This article has been cited by other articles:

J. Dingel and O. Milenkovic
List-decoding methods for inferring polynomials in finite dynamical gene network models
Bioinformatics, July 1, 2009; 25(13): 1686-1693.

G Malouf, B Falissard, D Azoulay, F Callea, L D Ferrell, Z D Goodman, Y Hayashi, H-C Hsu, S G Hubscher, M Kojiro, et al.
Is histological diagnosis of primary liver carcinomas with fibrous stroma reproducible among experts?
J. Clin. Pathol., June 1, 2009; 62(6): 519-524.

W.-K. Ching, S. Zhang, M. K. Ng, and T. Akutsu
An approximation method for solving the steady-state probability distribution of probabilistic Boolean networks
Bioinformatics, June 15, 2007; 23(12): 1511-1518.

P. Larranaga, B. Calvo, R. Santana, C. Bielza, J. Galdiano, I. Inza, J. A. Lozano, R. Armananzas, G. Santafe, A. Perez, et al.
Machine learning in bioinformatics
Brief Bioinform, March 1, 2006; 7(1): 86-112.

I. Shmulevich, S. A. Kauffman, and M. Aldana
Eukaryotic cells are dynamically ordered or critical but not chaotic
PNAS, September 20, 2005; 102(38): 13439-13444.

K. Jampachaisri, L. Valinsky, J. Borneman, and S. J. Press
Classification of oligonucleotide fingerprints: application for microbial community and gene expression analyses
Bioinformatics, July 15, 2005; 21(14): 3122-3130.

Y.-h. Taguchi and Y. Oono
Relational patterns of gene expression via non-metric multidimensional scaling analysis
Bioinformatics, March 15, 2005; 21(6): 730-740.

H. Wang, H. Wang, W. Shen, H. Huang, L. Hu, L. Ramdas, Y.-H. Zhou, W. S-L. Liao, G. N. Fuller, and W. Zhang
Insulin-like Growth Factor Binding Protein 2 Enhances Glioblastoma Invasion by Activating Invasion-enhancing Genes
Cancer Res., August 1, 2003; 63(15): 4315-4321.

X. Zhou, X. Wang, and E. R. Dougherty
Binarization of Microarray Data on the Basis of a Mixture Model
Mol. Cancer Ther., July 1, 2003; 2(7): 679-684.
