Bioinformatics Advance Access originally published online on March 12, 2008
Bioinformatics 2008 24(9):1183-1190; doi:10.1093/bioinformatics/btn098
A pattern recognition approach to infer time-lagged genetic interactions
1 Institute of Biomedical Engineering, National Taiwan University, Taipei 106; 2 Institute of Statistical Science, Academia Sinica, Taipei 115; and 3 Genome Research Center, National Yang-Ming University, Taipei 112, Taiwan
*To whom correspondence should be addressed.
Abstract
Motivation: Inferring time-lagged genetic interactions from time-course microarray data is challenging because the number of time points is small and the number of genes is large. The proposed pattern recognition (PARE) approach can infer such interactions from any time-course microarray data in which the gene interactions and their associated paired patterns are dependent. PARE uses a non-linear score to identify subclasses of gene pairs with different time lags. In each subclass, PARE extracts non-linear characteristics of paired gene-expression curves and learns the weights of the decision score by applying an optimization algorithm to microarray gene-expression data (MGED) of some known interactions, taken from biological experiments or published literature. In other words, PARE integrates MGED and existing knowledge via machine learning, and then predicts the remaining genetic interactions in the subclass.
Results: Three methods (PARE, a time-lagged correlation approach and the latest advance in graphical Gaussian models) were applied to predict 112 pairs of TC/TD interactions and 132 pairs of transcriptional regulatory interactions; in what follows, values in parentheses refer to the transcriptional regulatory interactions. Checked against qRT-PCR results (published literature), the true positive rates of the three methods are 73% (77%), 46% (51%) and 52% (59%), respectively. The false positive rates of predicting TC and TD (AT and RT) interactions in the yeast genome are bounded by 13% and 10% (10% and 14%), respectively. Several predicted TC/TD interactions coincide with existing pathways involving Sgs1, Srs2 and Mus81, which reinforces the possibility of applying genetic interactions to predict pathways of protein complexes. Moreover, some experimentally testable gene interactions involving DNA repair are predicted.
Availability: Supplementary data and PARE software are available at http://www.stat.sinica.edu.tw/~gshieh/pare.htm.
Contact: gshieh{at}stat.sinica.edu.tw
Associate Editor: Olga Troyanskaya
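As a rough illustration of the time-lagged correlation baseline mentioned in the Results, the sketch below picks, for one gene pair, the lag at which the Pearson correlation between the two expression curves is strongest. It is not PARE and not the authors' implementation; the function names (`pearson`, `best_lag`), the `max_lag` parameter and the toy expression values are all assumptions made for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def best_lag(regulator, target, max_lag=2):
    """Return (lag, r) with the largest |r|, where the target curve is
    compared against the regulator curve shifted `lag` time points earlier."""
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        x = regulator[: len(regulator) - lag]
        y = target[lag:]
        if len(x) < 3:  # too few overlapping time points to correlate
            break
        r = pearson(x, y)
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


# Hypothetical toy curves: the target repeats the regulator one time point later.
regulator = [0, 1, 0, -1, 0, 1, 0, -1]
target = [-1, 0, 1, 0, -1, 0, 1, 0]
lag, r = best_lag(regulator, target)
print(lag, round(r, 3))
```

With so few time points, a correlation of this kind is easily driven by noise, which is one reason a method that also learns from known interactions may be preferable.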
1 Note that this proportion can be as low as 30%, since this is only a preliminary check for the applicability of PARE. However, the 0.05 level of significance for Fisher's exact test in Steps 3–6 is essential, because this test justifies that there is indeed a significant association between gene interactions and their corresponding paired patterns. If the true positive rate (TPR) is not required, then the time lag of each gene pair can be predicted by Steps 1–5. These procedures are demonstrated in the two applications in the Results section, and have been automated and integrated into the PARE algorithm.
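The association check described in this note can be sketched with a two-sided Fisher's exact test on a 2x2 table. This is a minimal standard-library version, not the PARE software itself; the table layout (interaction type by paired pattern), the counts and the function name `fisher_exact_two_sided` are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(k):
        # Probability that the top-left cell equals k given fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance so ties with the observed table are included.
    p = sum(prob(k) for k in range(k_min, k_max + 1) if prob(k) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


# Hypothetical counts of known gene pairs.
# Rows: interaction type (e.g. TC, TD); columns: two kinds of paired pattern.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
significant = p_value < 0.05  # the 0.05 level named in the note
print(round(p_value, 4), significant)
```

For these made-up counts the p-value is about 0.035, so the association would pass the 0.05 threshold; with real data the table would be built from the known interactions and their observed paired patterns.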
Received on November 13, 2007; revised on February 4, 2008; accepted on March 6, 2008