© Oxford University Press

Dirichlet mixtures: a method for improved detection of weak but significant protein sequence homology

Kimmen Sjölander 3, Kevin Karplus, Michael Brown, Richard Hughey, Anders Krogh 1, I. Saira Mian 2 and David Haussler

Baskin Center for Computer Engineering and Information Sciences, Applied Sciences Building, University of California at Santa Cruz, Santa Cruz, CA 95064, USA
1The Sanger Centre, Hinxton Hall, Hinxton, Cambs CB10 1RQ, UK
2Life Sciences Division (Mail Stop 29-100), Lawrence Berkeley Laboratory, University of California, Berkeley, CA 94720, USA

3To whom correspondence should be addressed. E-mail: kimmen{at}cse.ucsc.edu

We present a method for condensing the information in multiple alignments of proteins into a mixture of Dirichlet densities over amino acid distributions. Dirichlet mixture densities are designed to be combined with observed amino acid frequencies to form estimates of expected amino acid probabilities at each position in a profile, hidden Markov model or other statistical model. These estimates give a statistical model greater generalization capacity, so that remotely related family members can be more reliably recognized by the model. This paper corrects the previously published formula for estimating these expected probabilities, and contains complete derivations of the Dirichlet mixture formulas, methods for optimizing the mixtures to match particular databases, and suggestions for efficient implementation.
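As an illustration of how a Dirichlet mixture can be combined with observed counts, the sketch below computes the standard posterior-mean estimate under a Dirichlet mixture prior: each component is weighted by its posterior probability given the counts, and the component-wise pseudocount estimates are then averaged. The abstract does not spell out the formula, so this should be read as an assumption about the usual form of the estimator rather than a transcription of the paper's derivation. The function names, the three-letter toy alphabet and all mixture parameters are hypothetical and chosen only to keep the example small.

```python
from math import exp, lgamma, log


def log_beta(alpha):
    """Log of the multivariate Beta function, B(alpha) = prod Gamma(a_i) / Gamma(sum a_i)."""
    return sum(lgamma(a) for a in alpha) - lgamma(sum(alpha))


def posterior_mean(counts, weights, alphas):
    """Estimate letter probabilities from counts under a Dirichlet mixture prior.

    counts  -- observed count of each letter at one alignment column
    weights -- mixture coefficients q_j (assumed positive, summing to 1)
    alphas  -- one Dirichlet parameter vector per mixture component
    """
    # Log posterior weight of each component, up to a shared constant:
    # log q_j + log B(alpha_j + n) - log B(alpha_j).
    log_post = [
        log(q) + log_beta([a + n for a, n in zip(alpha, counts)]) - log_beta(alpha)
        for q, alpha in zip(weights, alphas)
    ]
    # Normalise in log space to avoid underflow with large counts.
    top = max(log_post)
    unnorm = [exp(lp - top) for lp in log_post]
    z = sum(unnorm)
    resp = [u / z for u in unnorm]

    total = sum(counts)
    # Average the per-component pseudocount estimates (n_i + alpha_ji) / (|n| + |alpha_j|).
    return [
        sum(r * (counts[i] + alpha[i]) / (total + sum(alpha))
            for r, alpha in zip(resp, alphas))
        for i in range(len(counts))
    ]


if __name__ == "__main__":
    # Hypothetical two-component mixture over a three-letter alphabet.
    toy_weights = [0.6, 0.4]
    toy_alphas = [[5.0, 1.0, 1.0], [1.0, 1.0, 5.0]]
    print(posterior_mean([3, 0, 0], toy_weights, toy_alphas))
```

With no observations the estimate reduces to the prior mean of the mixture, and letters that were never observed still receive non-zero probability, which is the property that lets a model built this way score remote family members.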





