Improved sensitivity of profile searches through the use of sequence weights and gap excision
European Molecular Biology Laboratory, Postfach 102209, Meyerhofstrasse 1, 69012 Heidelberg, Germany
Position-specific substitution matrices, known as profiles, derived from multiple sequence alignments are currently used to search sequence databases for distantly related members of protein families. The performance of the database searches is enhanced by using (i) a sequence weighting scheme which assigns higher weights to more distantly related sequences based on branch lengths derived from phylogenetic trees, (ii) exclusion of positions with mainly padding characters at sites of insertions or deletions, and (iii) the BLOSUM62 residue comparison matrix. A natural consequence of these modifications is an improvement in the alignment of new sequences to the profiles. However, the accuracy of the alignments can be further increased by employing a similarity residue comparison matrix. These developments are implemented in a program called PROFILEWEIGHT, which runs on Unix and VAX computers. The only input required by the program is the multiple sequence alignment. The output from PROFILEWEIGHT is a profile designed to be used by existing searching and alignment programs. Test results from database searches with four different families of proteins show the improved sensitivity of the weighted profiles.
Received on April 30, 1993; accepted on September 30, 1993
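The two preprocessing ideas named in the abstract, gap excision and tree-based sequence weights, can be illustrated with a minimal Python sketch. It is not the PROFILEWEIGHT implementation. The function names, the nested-list tree encoding, the 50% gap threshold, and the rule of sharing each branch length equally among the sequences below it are all assumptions made for illustration; the abstract states only that weights are based on branch lengths and that positions with mainly padding characters are excluded. The last function returns weighted residue frequencies per column, which is only the counting step and not a full profile scored against BLOSUM62.

```python
GAP = "-"  # assumed padding character


def excise_gap_columns(alignment, max_gap_fraction=0.5):
    """Drop columns whose gap fraction exceeds max_gap_fraction (assumed cutoff)."""
    n_seqs = len(alignment)
    keep = [
        col for col in range(len(alignment[0]))
        if sum(seq[col] == GAP for seq in alignment) / n_seqs <= max_gap_fraction
    ]
    return ["".join(seq[col] for col in keep) for seq in alignment]


def leaf_names(node):
    """A node is a leaf name (str) or a list of (child, branch_length) pairs."""
    if isinstance(node, str):
        return [node]
    return [name for child, _ in node for name in leaf_names(child)]


def branch_length_weights(node, inherited=0.0, weights=None):
    """Weight each leaf by its path from the root, sharing every branch
    length equally among the leaves beneath that branch (assumed rule)."""
    if weights is None:
        weights = {}
    if isinstance(node, str):
        weights[node] = inherited
        return weights
    for child, length in node:
        share = length / len(leaf_names(child))
        branch_length_weights(child, inherited + share, weights)
    return weights


def weighted_column_frequencies(alignment, weights):
    """Per-column residue frequencies, each sequence counted by its weight."""
    total = sum(weights)
    profile = []
    for col in range(len(alignment[0])):
        freqs = {}
        for seq, weight in zip(alignment, weights):
            residue = seq[col]
            if residue != GAP:
                freqs[residue] = freqs.get(residue, 0.0) + weight / total
        profile.append(freqs)
    return profile


# Hypothetical toy data: A and B are near-duplicates, C is more distant.
names = ["A", "B", "C"]
alignment = ["MK-L", "MK-L", "MRAV"]
tree = [([("A", 0.1), ("B", 0.1)], 0.2), ("C", 0.5)]

trimmed = excise_gap_columns(alignment)
weights_by_name = branch_length_weights(tree)
weights = [weights_by_name[name] for name in names]
profile = weighted_column_frequencies(trimmed, weights)
```

In this toy case the third column is removed because two of its three entries are gaps, and the distant sequence C receives a larger weight than either A or B, so its residues are not swamped by the two near-identical sequences.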