Bioinformatics Advance Access published online on July 27, 2007
Bioinformatics, doi:10.1093/bioinformatics/btm360
AffyProbeMiner: a web resource for computing or retrieving accurately redefined Affymetrix probe sets
1Genomics and Bioinformatics Group, Laboratory of Molecular Pharmacology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
2Department of Biostatistics, Bioinformatics, and Biomathematics, Georgetown University Medical Center, 4000 Reservoir Road, NW, Washington, DC 20007, USA
3Department of Electrical and Computer Engineering, University of Maryland, College Park, College Park, MD 20742, USA
4Department of Information Systems, University of Maryland, Baltimore County, 1000 Hilltop Circle, MD 21050, USA
5Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County, 1000 Hilltop Circle, MD 21050, USA
6Department of Bioinformatics, George Mason University, Fairfax, Virginia, 20110, USA
7Mathematical and Statistical Computing Laboratory, Division of Computational Bioscience, Center for Information Technology, National Institutes of Health, Bethesda, MD 20892, USA
8SRA International, 4300 Fair Lakes Court, Fairfax, VA 22033, USA
To whom correspondence should be addressed. Prof. Hongfang Liu, E-mail: hl224{at}georgetown.edu
Abstract
Motivation: Affymetrix microarrays are widely used to measure global expression of mRNA transcripts. That technology is based on the concept of a probe set. Individual probes within a probe set were originally designated by Affymetrix to hybridize with the same unique mRNA transcript. Because of increasing accuracy in knowledge of genomic sequences, however, a substantial number of the manufacturer's original probe groupings and mappings are now known to be inaccurate and must be corrected. Otherwise, analysis and interpretation of an Affymetrix microarray experiment will be in error.
Results: AffyProbeMiner is a computationally efficient, platform-independent tool that uses all RefSeqs and validated complete coding sequences in GenBank to (1) regroup the individual probes into consistent probe sets and (2) remap the probe sets to the correct sets of mRNA transcripts. The individual probes are grouped into probe sets that are transcript-consistent in that they hybridize to the same mRNA transcript (or transcripts) and, therefore, measure the same entity (or entities). About 65.6% of the probe sets on the HG-U133A chip were affected by the remapping. Pre-computed regrouped and remapped probe sets for many Affymetrix microarrays are made freely available at the AffyProbeMiner web site. Alternatively, we provide a web service that enables the user to perform the remapping for any type of short-oligo commercial or custom array that has an Affymetrix-format Chip Definition File (CDF). Important features that differentiate AffyProbeMiner from other approaches are flexibility in the handling of splice variants, computational efficiency, extensibility, customizability, and user-friendliness of the interface.
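The idea of a transcript-consistent probe set can be illustrated with a minimal sketch. It assumes that each probe has already been aligned against the transcript collection and that the result is available as a mapping from probe identifier to the set of transcripts it matches. The function name `regroup_probes`, the probe identifiers, and the transcript labels are hypothetical and chosen for illustration only; this is not the AffyProbeMiner implementation, which may apply additional criteria not shown here.

```python
from collections import defaultdict


def regroup_probes(probe_hits):
    """Group probes whose sets of matched transcripts are identical.

    probe_hits: dict mapping a probe ID to the set of transcript IDs
    that the probe matches (hypothetical input format).
    Returns a dict mapping each distinct transcript set (as a frozenset)
    to the sorted list of probe IDs that share it.
    """
    groups = defaultdict(list)
    for probe_id, transcripts in probe_hits.items():
        key = frozenset(transcripts)
        if key:  # probes that match no transcript are left ungrouped
            groups[key].append(probe_id)
    return {key: sorted(ids) for key, ids in groups.items()}


# Hypothetical example: four probes from one original probe set,
# two of which also match a second transcript, plus one unmatched probe.
probe_hits = {
    "p1": {"NM_A"},
    "p2": {"NM_A"},
    "p3": {"NM_A", "NM_B"},
    "p4": {"NM_A", "NM_B"},
    "p5": set(),
}
probe_sets = regroup_probes(probe_hits)
```

In this sketch, every probe in a resulting group matches exactly the same transcripts, so each group measures the same entity or entities. Keying on the full set of matched transcripts is one simple way to keep probes that hit several splice variants apart from probes that hit only one.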
Availability: The web interface and software (GPL open source license) are publicly accessible at http://discover.nci.nih.gov/affyprobeminer.
Contact: hl224{at}georgetown.edu or barry{at}discover.nci.nih.gov
Associate Editor: Dr. Joaquin Dopazo
Received on February 1, 2007; revised on June 20, 2007; accepted on July 9, 2007