Bioinformatics Advance Access published online on April 8, 2004
Bioinformatics, doi:10.1093/bioinformatics/bth235
Bioinformatics © Oxford University Press 2004; all rights reserved
1 Department of Genetics, Washington University School of Medicine, St Louis, Missouri 63110 USA
* To whom correspondence should be addressed. E-mail: mona@cs.princeton.edu
Motivation: Knowledge of how proteomic amino acid composition has changed over time is important for constructing realistic models of protein evolution and increasing our understanding of molecular evolutionary history. The proteomic amino acid composition of the Last Universal Ancestor of life (LUA) is of particular interest, since that might provide insight into the early evolution of proteins and the nature of the LUA itself.
Results: We introduce a method to estimate ancestral amino acid composition that is based on expectation-maximization (EM). On simulated data, the approach was found to be very effective in estimating ancestral amino acid composition, with accuracy improving as the number of residues in the dataset was increased. The method was then used to infer the amino acid composition of a set of proteins in the LUA. In general, as compared with the modern protein set, LUA proteins were found to be richer in amino acids that are believed to have been most abundant in the prebiotic environment and poorer in those believed to have been unavailable or scarce. Additionally, we found the inferred amino acid composition of this protein set in the LUA to be more similar to the observed composition of the same set in extant thermophilic species than in extant mesophilic species, supporting the idea that the LUA lived in a thermophilic environment.
Availability: The program is available upon request.
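The abstract compares amino acid compositions (inferred LUA versus extant thermophilic and mesophilic protein sets) without stating here how similarity is measured. As an illustration only, the following minimal sketch computes the composition of a set of sequences and compares two compositions by Euclidean distance. The function names and the choice of Euclidean distance are assumptions made for this example; this is not the authors' EM method or their program.

```python
from collections import Counter
import math

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequences):
    """Return the fraction of each standard amino acid over all sequences.

    Hypothetical helper for illustration; non-standard characters
    (gaps, X, etc.) are ignored.
    """
    counts = Counter()
    for seq in sequences:
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acid residues found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def euclidean_distance(comp_a, comp_b):
    """Euclidean distance between two composition vectors.

    Assumed similarity measure; the abstract does not name one.
    """
    return math.sqrt(sum((comp_a[aa] - comp_b[aa]) ** 2 for aa in AMINO_ACIDS))
```

With such helpers, a composition estimated for an ancestral protein set could be compared against the observed compositions of several extant sets, the smaller distance indicating the more similar set. The sequences passed in would come from the reader's own data; none are supplied here.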
Revised March 5, 2004
Accepted March 23, 2004
Article
A novel method for estimating ancestral amino acid composition and its application to proteins of the Last Universal Ancestor
2 Department of Molecular Biology, Princeton University, Princeton, New Jersey 08544 USA
3 Department of Computer Science and the Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544 USA