Bioinformatics Advance Access originally published online on October 25, 2006
Bioinformatics 2006 22(24):3003-3008; doi:10.1093/bioinformatics/btl544
Hypothesis testing approaches to the exon prediction problem
1 Statistics Department, University of Barcelona, Barcelona, Spain
2 Genetics Unit, Department of Experimental and Health Sciences, Pompeu Fabra University, Barcelona, Spain
*To whom correspondence should be addressed.
Motivation: Many gene identification methods assign scores to gene elements prior to their assembly into predicted genes. The scoring system is often based on log-likelihood ratios. These methods usually perform well, but the statistical significance of a given score is difficult to interpret.
Results: We have developed several tests of significance for the scores: (1) a sum-of-scores test (SST); (2) an intersection-union test (IUT), based on a multiple hypothesis testing interpretation of an exon's score; and (3) a meta-analytical approach (MA), which combines several P-values, corresponding to the exon's parts, to yield a global P-value. Simulation studies show that the MA has better sensitivity and specificity than the other methods and is easier for non-expert users to interpret. This advantage is especially relevant for users who want to make predictions on incomplete gene sequences.
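As an illustration of how several per-part P-values can be reduced to one global P-value, the sketch below uses Fisher's combined probability test. This is an assumption made for the example: the abstract does not state which combining rule the MA uses, and Fisher's method additionally assumes the component P-values are independent. For comparison, the sketch also shows the usual intersection-union rule, in which the global P-value is the largest component P-value. The function names and the three example P-values are hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent P-values with Fisher's method (assumed rule).

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the global null hypothesis. For an
    even number of degrees of freedom the upper tail has a closed form:
        P(chi2_{2k} >= X) = exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    """
    if not p_values:
        raise ValueError("at least one P-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("P-values must lie in (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2

    # Accumulate the finite series term by term to avoid large factorials.
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def intersection_union_p(p_values):
    """Global P-value of an intersection-union test: the largest component."""
    if not p_values:
        raise ValueError("at least one P-value is required")
    return max(p_values)


if __name__ == "__main__":
    # Hypothetical P-values for three parts of one candidate exon.
    parts = [0.04, 0.01, 0.20]
    print("Fisher combined P:", fisher_combined_p(parts))
    print("Intersection-union P:", intersection_union_p(parts))
```

With a single P-value, the Fisher combination returns that P-value unchanged, which is a convenient sanity check. The intersection-union rule is the more conservative of the two, since every part must be significant on its own for the global P-value to be small.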
Contact: asanchez{at}ub.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
Received on May 18, 2006; revised on October 18, 2006; accepted on October 18, 2006