Bioinformatics Advance Access originally published online on February 26, 2004
Bioinformatics 20(12) © Oxford University Press 2004; all rights reserved.
Multiple-sequence functional annotation and the generalized hidden Markov phylogeny
1 Department of Statistics, University of California, 367 Evans Hall, Berkeley, CA 94720, USA, 2 Department of Mathematics, University of California, 970 Evans Hall, Berkeley, CA 94720, USA and 3 Division of Computer Science, University of California, 387 Soda Hall, Berkeley, CA 94720, USA
Received on September 20, 2003; revised on January 4, 2004; accepted on January 20, 2004
Advance Access Publication February 26, 2004
Motivation: Phylogenetic shadowing is a comparative genomics principle that allows for the discovery of conserved regions in sequences from multiple closely related organisms. We develop a formal probabilistic framework for combining phylogenetic shadowing with feature-based functional annotation methods. The resulting model, a generalized hidden Markov phylogeny (GHMP), applies to a variety of situations where functional regions are to be inferred from evolutionary constraints.
Results: We show how GHMPs can be used to predict complete shared gene structures in multiple primate sequences. We also describe SHADOWER, our implementation of such a prediction system. We find that SHADOWER outperforms previously reported ab initio gene finders, including comparative human-mouse approaches, on a small sample of diverse exonic regions. Finally, we report on an empirical analysis of SHADOWER's performance which reveals that as few as five well-chosen species may suffice to attain maximal sensitivity and specificity in exon demarcation.
Availability: A Web server is available at http://bonaire.lbl.gov/shadower
Contact: jordan{at}cs.berkeley.edu
* To whom correspondence should be addressed.