Bioinformatics Advance Access published online on February 26, 2004
Bioinformatics, doi:10.1093/bioinformatics/bth153
Bioinformatics © Oxford University Press 2004; all rights reserved
1 Department of Statistics, University of California, 367 Evans Hall, Berkeley, CA 94720, USA
* To whom correspondence should be addressed. E-mail: jordan{at}cs.berkeley.edu.
Motivation: Phylogenetic shadowing is a comparative genomics principle that allows for the discovery of conserved regions in sequences from multiple closely related organisms. We develop a formal probabilistic framework for combining phylogenetic shadowing with feature-based functional annotation methods. The resulting model, a generalized hidden Markov phylogeny (GHMP), applies to a variety of situations where functional regions are to be inferred from evolutionary constraints.

Results: We show how GHMPs can be used to predict complete shared gene structures in multiple primate sequences. We also describe SHADOWER, our implementation of such a prediction system. We find that SHADOWER outperforms previously reported ab initio gene finders, including comparative human-mouse approaches, on a small sample of diverse exonic regions. Finally, we report on an empirical analysis of SHADOWER's performance, which reveals that as few as five well-chosen species may suffice to attain maximal sensitivity and specificity in exon demarcation.

Availability: A Web server is available at http://bonaire.lbl.gov/shadower.
Revised January 4, 2004
Accepted January 20, 2004
Article
Multiple-sequence functional annotation and the generalized hidden Markov phylogeny
2 Department of Mathematics, University of California, 970 Evans Hall, Berkeley, CA 94720, USA
1 Department of Statistics, University of California, 367 Evans Hall, Berkeley, CA 94720, USA; 3 Division of Computer Science, University of California, 387 Soda Hall, Berkeley, CA 94720, USA