Bioinformatics Advance Access published online on March 3, 2005
Bioinformatics, doi:10.1093/bioinformatics/bti363
1 Department of Statistics, Texas A&M University, College Station, TX 77843-3143, USA
* To whom correspondence should be addressed.
Motivation: We wish to predict protein inter-domain linker regions using sequence alone, without requiring known homology. Identifying linker regions will delineate domain boundaries, and can be used to computationally dissect proteins into domains prior to clustering them into families. We develop a hidden Markov model (HMM) of linker/non-linker sequence regions using a linker index derived from amino acid propensity. We employ an efficient Bayesian estimation of the model using Markov chain Monte Carlo (MCMC), particularly Gibbs sampling, to simulate parameters from the posteriors. Our model recognizes sequence data to be continuous rather than categorical, and generates a probabilistic output.

Results: We applied our method to a dataset of protein sequences in which domains and inter-domain linkers had been delineated using the Pfam-A database. The prediction results are superior to a simpler method that also uses linker index.

Supplementary Information: http://racerx00.tamu.edu/kbae.
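As a rough illustration of the kind of model the Motivation describes, the sketch below runs a two-state (non-linker/linker) HMM over a sequence of continuous per-residue linker index values and returns a posterior probability of the linker state at each position. It is not the authors' implementation: the Gaussian emission densities, the fixed parameter values, and the toy input are all assumptions made for illustration, and the parameters are simply set by hand rather than simulated from their posteriors by Gibbs sampling as in the paper.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution (assumed emission model)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_linker(obs, start, trans, means, sds):
    """Scaled forward-backward for a two-state HMM.

    State 0 = non-linker, state 1 = linker.
    Returns P(state = linker | all observations) for each position.
    """
    n = len(obs)
    states = (0, 1)
    emit = [[gaussian_pdf(x, means[s], sds[s]) for s in states] for x in obs]

    # Forward pass, rescaled at every step to avoid underflow.
    alpha, scale = [], []
    cur = [start[s] * emit[0][s] for s in states]
    for t in range(n):
        if t > 0:
            cur = [
                emit[t][s] * sum(alpha[t - 1][r] * trans[r][s] for r in states)
                for s in states
            ]
        c = sum(cur)
        alpha.append([v / c for v in cur])
        scale.append(c)

    # Backward pass, using the same scaling constants.
    beta = [[1.0, 1.0] for _ in range(n)]
    for t in range(n - 2, -1, -1):
        for r in states:
            beta[t][r] = sum(
                trans[r][s] * emit[t + 1][s] * beta[t + 1][s] for s in states
            ) / scale[t + 1]

    # Combine and normalise to get per-position posteriors.
    post = []
    for t in range(n):
        g = [alpha[t][s] * beta[t][s] for s in states]
        post.append(g[1] / sum(g))
    return post


# Hypothetical parameters and toy linker-index values (not from the paper).
start = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]   # "sticky" states favour contiguous regions
means = [0.0, 1.0]                 # assumed mean index: non-linker, linker
sds = [0.5, 0.5]
linker_index = [0.1, -0.2, 0.0, 0.9, 1.1, 1.0, 0.8, 0.1, -0.1]

post = posterior_linker(linker_index, start, trans, means, sds)
predicted = ["L" if p > 0.5 else "-" for p in post]
```

Thresholding the posterior at 0.5, as in the last line, is only one possible way to turn the probabilistic output into a linker/non-linker call; in a Bayesian treatment the posteriors would additionally be averaged over sampled parameter values.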
Received November 22, 2004
Revised February 9, 2005
Accepted February 26, 2005
Prediction of protein inter-domain linker regions by a hidden Markov model
2 Department of Animal Science and Intercollegiate Faculty of Genetics, Texas A&M University, College Station, TX 77843-2471, USA
Christine G. Elsik, E-mail: c-elsik{at}tamu.edu