Bioinformatics Advance Access published online on November 5, 2004
Bioinformatics, doi:10.1093/bioinformatics/bti125
Bioinformatics © Oxford University Press 2004; all rights reserved
1 Department of Protein Evolution, Max-Planck-Institute for Developmental Biology, Spemannstrasse 35, D-72076 Tübingen, Germany
* To whom correspondence should be addressed.
Motivation: Protein homology detection and sequence alignment are at the basis of protein structure prediction, function prediction, and evolution.

Results: We have generalized the alignment of protein sequences with a profile hidden Markov model (HMM) to the case of pairwise alignment of profile HMMs. We present a method for detecting distant homologous relationships between proteins based on this approach. The method (HHsearch) is benchmarked together with BLAST, PSI-BLAST, HMMER, and the profile-profile comparison tools PROF_SIM and COMPASS, in an all-against-all comparison of a database of 3691 protein domains from SCOP 1.63 with pairwise sequence identities below 20%.

Sensitivity: When predicted secondary structure is included in the HMMs, HHsearch is able to detect between 2.7 and 4.2 times more homologs than PSI-BLAST or HMMER and between 1.44 and 1.9 times more than COMPASS or PROF_SIM for a rate of false positives of 10%. Approximately half of the improvement over the profile-profile comparison methods is attributable to the use of profile HMMs in place of simple profiles.

Alignment quality: Higher sensitivity is mirrored by an increased alignment quality. HHsearch produced 1.2, 1.7, and 3.3 times more good alignments ("balanced" score > 0.3) than the next best method (COMPASS), and 1.6, 2.9, and 9.4 times more than PSI-BLAST, at the family, superfamily, and fold level.

Speed: HHsearch scans a query of 200 residues against 3691 domains in 33 s on an AMD64 3 GHz PC. This is 10 times faster than PROF_SIM and 17 times faster than COMPASS.

Availability: HHsearch can be downloaded from http://protevo.eb.tuebingen.mpg.de/download/ together with up-to-date versions of SCOP and PFAM. A web server is available at http://protevo.eb.tuebingen.mpg.de/toolkit/index.php?view=hhpred.
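The speed figures are given as one absolute time and two ratios, so the runtimes of the other tools are only implied. The short sketch below works that arithmetic out. It assumes the quoted factors are exact multiples of the 33 s HHsearch time, which the abstract does not guarantee since the factors are probably rounded. The average cost per database domain also ignores any fixed start-up overhead and variation in domain length. The function and variable names are illustrative only and are not part of HHsearch.

```python
# Figures quoted in the abstract.
HHSEARCH_SECONDS = 33.0   # one 200-residue query scanned against the database
NUM_DOMAINS = 3691        # SCOP 1.63 domains in the benchmark set
# How many times slower each tool is than HHsearch, as quoted.
SLOWDOWN_FACTORS = {"PROF_SIM": 10, "COMPASS": 17}


def implied_runtimes(base_seconds, slowdown_factors):
    """Return the runtime in seconds implied for each tool by its slowdown factor."""
    return {tool: base_seconds * factor for tool, factor in slowdown_factors.items()}


def per_domain_ms(total_seconds, num_domains):
    """Average time per query-vs-domain comparison, in milliseconds."""
    return 1000.0 * total_seconds / num_domains


runtimes = implied_runtimes(HHSEARCH_SECONDS, SLOWDOWN_FACTORS)
for tool, seconds in runtimes.items():
    print(f"{tool}: about {seconds:.0f} s per query (implied)")

print(f"HHsearch: about {per_domain_ms(HHSEARCH_SECONDS, NUM_DOMAINS):.1f} ms per domain on average")
```

Under these assumptions a single query would take roughly five and a half minutes with PROF_SIM and a little over nine minutes with COMPASS, against about nine milliseconds per database domain for HHsearch.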
Revised October 18, 2004
Accepted November 2, 2004
Article
Protein homology detection by HMM-HMM comparison
Johannes Söding, E-mail: johannes.soeding{at}tuebingen.mpg.de