The rapid generation of mutation data matrices from protein sequences
1Biomolecular Structure and Modelling Unit, Department of Biochemistry and Molecular Biology, University College, Gower Street, London WC1E 6BT
2Laboratory of Mathematical Biology, National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UK
An efficient means for generating mutation data matrices from large numbers of protein sequences is presented here. By means of an approximate peptide-based sequence comparison algorithm, the set of sequences is clustered at the 85% identity level. The most closely related pairs of sequences are aligned, and the observed amino acid exchanges are tallied in a matrix. The raw mutation frequency matrix is processed in a similar way to that described by Dayhoff et al. (1978), so the resulting matrices may easily be used in current sequence analysis applications in place of the standard mutation data matrices, which have not been updated for 13 years. The method is fast enough to process the entire SWISS-PROT databank in 20 h on a Sun SPARCstation 1, and to generate a matrix from a specific family or class of proteins in minutes. Differences observed between our 250 PAM mutation data matrix and the matrix calculated by Dayhoff et al. are briefly discussed.
Received on October 21, 1991; accepted on December 6, 1991
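The tallying step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the sequence pairs have already been aligned (with `-` marking gaps), it leaves out the peptide-based clustering and the Dayhoff-style processing of the raw counts, and it assumes that each exchange is counted symmetrically. The names `percent_identity` and `tally_exchanges` are hypothetical; the only figure taken from the abstract is the 85% identity threshold.

```python
from collections import Counter

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"  # the 20 standard residues


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical residues over aligned columns without gaps."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def tally_exchanges(aligned_pairs, min_identity=85.0):
    """Count amino acid exchanges in sufficiently similar aligned pairs.

    Assumption: counts are symmetric, so an observed A/G column adds one
    to both (A, G) and (G, A). Identical columns, gaps, and non-standard
    symbols are ignored.
    """
    counts = Counter()
    for seq_a, seq_b in aligned_pairs:
        if len(seq_a) != len(seq_b):
            raise ValueError("aligned sequences must have equal length")
        if percent_identity(seq_a, seq_b) < min_identity:
            continue  # pair is too divergent for this sketch
        for a, b in zip(seq_a, seq_b):
            if a == b or a not in AMINO_ACIDS or b not in AMINO_ACIDS:
                continue
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts


if __name__ == "__main__":
    pairs = [
        ("ACDEFGHIKLMNPQRSTVWY", "ACDEFGHIKLMNPQRSTVWF"),  # 95% identical
        ("AAAA", "AGGA"),  # 50% identical, skipped
    ]
    for (a, b), n in sorted(tally_exchanges(pairs).items()):
        print(a, b, n)
```

Restricting the tally to closely related pairs keeps the chance of multiple substitutions at one site low, which is why a raw count matrix of this kind can then be extrapolated to larger evolutionary distances such as 250 PAM.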