Bioinformatics Vol. 16 no. 2 2000
Pages 152-158
© 2000 Oxford University Press
Modeling splice sites with Bayes networks
1 Department of Electrical Engineering and
Computer Science, University of Illinois, Chicago, IL 60607, USA
2 Department of Computer Science, Loyola
College in Maryland, Baltimore, MD 21210, USA and Celera Genomics,
Rockville, MD 20850, USA
Motivation: The main goal of this paper is to develop accurate probabilistic models for important functional regions in DNA sequences (e.g. splice junctions that signal the beginning and end of transcription in human DNA). These models can subsequently be used to improve the performance of gene-finding systems. The models built here attempt to capture long-distance dependencies between non-adjacent bases.
Results: An efficient modeling method is described which models biological data more accurately than a first-order Markov model without increasing the number of parameters. Intuitively, a small number of parameters helps a learning system to avoid overfitting. Several experiments with the model are presented; they show a small improvement in average accuracy compared with a simple Markov model. These experiments suggest that single long-distance dependencies do not help the recognition problem, thus confirming several previous studies which have used more heuristic modeling techniques.
Availability: This software is available for download and as a web resource at http://www.ai.uic.edu/software
Contact: kasif{at}eecs.uic.edu
Received on November 6, 1998; revised on April 23, 1999; accepted on June 17, 1999
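The comparison in the Results paragraph can be illustrated with a small sketch. It is not the authors' software: the function names, the parent assignments and the toy sequences below are all hypothetical, chosen only to show why a tree-structured Bayes network in which each position has a single (possibly non-adjacent) parent has exactly as many free parameters as a position-specific first-order Markov chain.

```python
import math

BASES = "ACGT"

def fit(seqs, parents, pseudo=1.0):
    """Estimate P(x_i | x_parent(i)) for each position, with pseudocounts.

    parents[i] is the index of the single parent of position i,
    or None for the root. This is an illustrative estimator only.
    """
    tables = []
    for i, p in enumerate(parents):
        keys = [""] if p is None else list(BASES)
        counts = {k: {b: pseudo for b in BASES} for k in keys}
        for s in seqs:
            k = "" if p is None else s[p]
            counts[k][s[i]] += 1
        # Normalize each row of counts into a conditional distribution.
        tables.append({
            k: {b: c / sum(row.values()) for b, c in row.items()}
            for k, row in counts.items()
        })
    return tables

def log_likelihood(seq, parents, tables):
    """Log-probability of one sequence under the fitted model."""
    total = 0.0
    for i, p in enumerate(parents):
        k = "" if p is None else seq[p]
        total += math.log(tables[i][k][seq[i]])
    return total

def n_free_parameters(parents):
    """3 free parameters for a root, 4 * 3 = 12 for a position with one parent."""
    return sum(3 if p is None else 12 for p in parents)

# Toy, made-up aligned sites (NOT real data), for illustration only.
TOY_SITES = ["GTAAGT", "GTGAGT", "GTAAGA", "GTCAGT"]

# First-order Markov chain: each position depends on its left neighbour.
chain_parents = [None, 0, 1, 2, 3, 4]
# Hypothetical tree: some positions depend on a non-adjacent base instead.
tree_parents = [None, 0, 0, 2, 2, 4]

chain_tables = fit(TOY_SITES, chain_parents)
tree_tables = fit(TOY_SITES, tree_parents)

print(n_free_parameters(chain_parents), n_free_parameters(tree_parents))
print(log_likelihood("GTAAGT", chain_parents, chain_tables))
print(log_likelihood("GTAAGT", tree_parents, tree_tables))
```

Because every non-root position still has exactly one parent, swapping an adjacent parent for a non-adjacent one changes which dependency is modeled but not how many conditional probabilities must be estimated. How the parent of each position is chosen is left open here; the sketch simply fixes the structure by hand.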