Bioinformatics Vol. 18 no. 9 2002
Pages 1167-1175
© 2002 Oxford University Press
Identification of regulatory elements using a feature selection method
1,*
1 Division of Biostatistics, University of California,
Berkeley, CA 94720, USA
2 Department of Molecular and Cell Biology,
University of California, Berkeley, CA 94720, USA
3 Life Sciences Division, Ernest Orlando Lawrence Berkeley National Lab,
Berkeley, CA 94720, USA
Received on January 19, 2002; revised on February 20, 2002; accepted on March 21, 2002
Motivation: Many methods have been described to identify regulatory motifs in the transcription control regions of genes that exhibit similar patterns of gene expression across a variety of experimental conditions. Here we focus on a single experimental condition, and use gene expression data to identify sequence motifs associated with genes that are activated under this condition. We use a linear model with two-way interactions to model gene expression as a function of sequence features (words) present in presumptive transcription control regions. The most relevant features are selected by a feature selection method, stepwise selection with Monte Carlo cross-validation. We apply this method to a publicly available dataset of the yeast Saccharomyces cerevisiae, focussing on the 800 base pairs immediately upstream of each gene's translation start site, the upstream control region (UCR).
Results: We successfully identify regulatory motifs that are known to be active under the experimental conditions analyzed, and find additional significant sequences that may represent novel regulatory motifs. We also discuss a complementary method that uses gene expression data from a single microarray experiment and allows averaging over a variety of experimental conditions, as an alternative to motif-finding methods that act on clusters of co-expressed genes.
Availability: The software is available upon request from the first author or may be downloaded from http://www.stat.berkeley.edu/~sunduz.
Contact: keles{at}stat.berkeley.edu
* To whom correspondence should be addressed.
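
The modelling idea summarized in the abstract (expression regressed on word counts from upstream sequences, with two-way interactions, and features chosen by stepwise selection with Monte Carlo cross-validation) can be illustrated with a minimal sketch. This is not the authors' software. The function names, the word length `k`, the greedy residual-sum-of-squares criterion, and the split settings (`n_splits`, `test_frac`) are all assumptions made for illustration, and using the cross-validation to choose the model size is one plausible reading of the procedure.

```python
import itertools
import numpy as np


def word_counts(seqs, k):
    """Count overlapping occurrences of every DNA word of length k per sequence."""
    words = ["".join(p) for p in itertools.product("ACGT", repeat=k)]
    index = {w: j for j, w in enumerate(words)}
    X = np.zeros((len(seqs), len(words)))
    for i, s in enumerate(seqs):
        for pos in range(len(s) - k + 1):
            j = index.get(s[pos:pos + k])  # windows with non-ACGT symbols are skipped
            if j is not None:
                X[i, j] += 1
    return X, words


def add_interactions(X):
    """Append the product of every pair of columns (two-way interaction terms)."""
    pairs = list(itertools.combinations(range(X.shape[1]), 2))
    inter = [X[:, a] * X[:, b] for a, b in pairs]
    return np.column_stack([X] + inter), pairs


def fit_predict(X_train, y_train, X_test):
    """Ordinary least squares with an intercept; returns predictions for X_test."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    beta, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ beta


def forward_path(X, y, max_size):
    """Greedy forward selection: each step adds the column that most reduces RSS."""
    chosen = []
    for _ in range(min(max_size, X.shape[1])):
        best_j, best_rss = None, np.inf
        for j in range(X.shape[1]):
            if j in chosen:
                continue
            cols = chosen + [j]
            resid = y - fit_predict(X[:, cols], y, X[:, cols])
            rss = float(resid @ resid)
            if rss < best_rss:
                best_j, best_rss = j, rss
        chosen.append(best_j)
    return chosen


def mccv_size(X, y, max_size, n_splits=20, test_frac=0.3, seed=0):
    """Monte Carlo cross-validation: average test error over random splits
    for each model size along the forward path, then pick the best size."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = int(round(test_frac * n))
    err = np.zeros(max_size + 1)
    for _ in range(n_splits):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]
        path = forward_path(X[train], y[train], max_size)
        for size in range(max_size + 1):
            cols = np.asarray(path[:size], dtype=int)  # size 0 is intercept only
            pred = fit_predict(X[train][:, cols], y[train], X[test][:, cols])
            err[size] += np.mean((y[test] - pred) ** 2)
    err /= n_splits
    return int(np.argmin(err)), err
```

In this sketch, `word_counts` would be applied to the upstream control regions, `add_interactions` to a small set of candidate word columns (the number of pairs grows quadratically), and `mccv_size` to the resulting design matrix and the expression values; a final `forward_path` run on all genes with the chosen size would then give the selected words.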