Bioinformatics Advance Access published online on March 18, 2008
Bioinformatics, doi:10.1093/bioinformatics/btn096
An analytical pipeline for genomic representations used for cytosine methylation studies
Departments of 1Molecular Genetics, 3Medicine (Infectious Diseases), 4Microbiology & Immunology, 6Pathology, and 8Medicine (Hematology), Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
2Department of Biostatistics, Virginia Commonwealth University, 730 East Broad Street, Richmond, VA 23298, USA
5Roche NimbleGen, 1 Science Court, Madison, WI 53711 USA
7Bioinformatics Shared Resource, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, NY 10461, USA
*To whom correspondence should be addressed. Dr. John Greally, E-mail: jgreally{at}aecom.yu.edu
Abstract
Motivation: Representations of the genome can be generated by the selection of a subpopulation of restriction fragments using ligation-mediated PCR. Such representations form the basis for a number of high-throughput assays, including the HELP assay to study cytosine methylation. We find that HELP data analysis is complicated not only by PCR amplification heterogeneity but also by a complex and variable distribution of cytosine methylation. To address this, we created an analytical pipeline and novel normalization approach that improves concordance between microarray-derived HELP data and single-locus validation results, demonstrating the value of the analytical approach. A major influence on the PCR amplification is the size of the restriction fragment, requiring a quantile normalization approach that reduces the influence of fragment length on signal intensity. Here we describe all of the components of the pipeline, which can also be applied to data derived from other assays based on genomic representations.
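The idea of reducing the influence of fragment length on signal intensity can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `length_binned_quantile_normalize`, the equal-count binning by fragment length, the choice of the pooled intensities as the reference distribution, and the synthetic data are all assumptions made for illustration. Within each length bin, every intensity is replaced by the matching quantile of the pooled distribution, so that bins of short and long fragments end up with the same intensity distribution.

```python
import random


def length_binned_quantile_normalize(intensities, lengths, n_bins=10):
    """Quantile-normalize intensities across fragment-length bins (illustrative)."""
    n = len(intensities)
    if n != len(lengths):
        raise ValueError("intensities and lengths must have the same size")

    # Assign fragments to equal-count bins by fragment-length rank (assumption).
    order_by_length = sorted(range(n), key=lambda i: lengths[i])
    bins = [[] for _ in range(n_bins)]
    for rank, i in enumerate(order_by_length):
        bins[rank * n_bins // n].append(i)

    # Reference distribution: all intensities pooled and sorted (assumption).
    reference = sorted(intensities)
    normalized = [0.0] * n

    for members in bins:
        by_intensity = sorted(members, key=lambda i: intensities[i])
        m = len(by_intensity)
        for rank, i in enumerate(by_intensity):
            # Map the within-bin rank to a fractional position in the reference.
            pos = (rank + 0.5) / m * n - 0.5
            pos = max(0.0, min(pos, n - 1))
            lo = int(pos)
            hi = min(lo + 1, n - 1)
            frac = pos - lo
            # Linear interpolation between neighbouring reference values.
            normalized[i] = reference[lo] * (1 - frac) + reference[hi] * frac
    return normalized


def pearson(x, y):
    """Pearson correlation coefficient, written out to avoid dependencies."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Synthetic example: longer fragments amplify less efficiently (made-up numbers).
random.seed(0)
frag_lengths = [random.uniform(200, 2000) for _ in range(2000)]
raw = [10 - 0.002 * length + random.gauss(0, 0.5) for length in frag_lengths]
adjusted = length_binned_quantile_normalize(raw, frag_lengths, n_bins=10)

corr_before = pearson(frag_lengths, raw)
corr_after = pearson(frag_lengths, adjusted)
print(f"length/intensity correlation before: {corr_before:.3f}")
print(f"length/intensity correlation after:  {corr_after:.3f}")
```

In this sketch the rank order of fragments within each length bin is preserved, and only the between-bin differences in distribution are removed. The number of bins trades off residual length dependence within a bin against the stability of each bin's quantile estimates.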
Contact: jgreally{at}aecom.yu.edu
Associate Editor: Dr. Joaquin Dopazo
Received on November 7, 2007; revised on January 24, 2008; accepted on March 7, 2008