Published by Oxford University Press 2005
A Gamma mixture model better accounts for among site rate heterogeneity
1Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University Ramat Aviv 69978, Israel
2School of Computer Science and Engineering, Hebrew University Jerusalem 91904, Israel
*To whom correspondence should be addressed.
Motivation: Variation of substitution rates across nucleotide and amino acid sites has long been recognized as a characteristic of molecular sequence evolution. Evolutionary models that account for this rate heterogeneity usually use a gamma density function to model the rate distribution across sites. This density function, however, may not fit real datasets, especially when there is a multimodal distribution of rates. Here, we present a novel evolutionary model based on a mixture of gamma density functions. This model better describes the among-site rate variation characteristic of molecular sequence evolution. The use of this model may improve the accuracy of various phylogenetic methods, such as reconstructing phylogenetic trees, dating divergence events, inferring ancestral sequences and detecting conserved sites in proteins.
Results: Using diverse sets of protein sequences we show that the gamma mixture model better describes the stochastic process underlying protein evolution. We show that the proposed gamma mixture model fits protein datasets significantly better than the single-gamma model in 9 out of 10 datasets tested. We further show that using the gamma mixture model improves the accuracy of model-based prediction of conserved residues in proteins.
Availability: C++ source codes are available from the authors upon request.
Contact: talp{at}post.tau.ac.il
This article has been cited by other articles:
![]() |
A. Stern, A. Doron-Faigenboim, E. Erez, E. Martz, E. Bacharach, and T. Pupko Selecton 2007: advanced models for detecting positive and purifying selection using a Bayesian inference approach Nucleic Acids Res., July 13, 2007; 35(suppl_2): W506 - W511. [Abstract] [Full Text] [PDF] |
||||
