Bioinformatics Vol. 17 no. 8 2001
Pages 700-712
© 2001 Oxford University Press

AL2CO: calculation of positional conservation in a protein sequence alignment

Jimin Pei 2 and Nick V. Grishin 1,2,*

1 Howard Hughes Medical Institute
2 Department of Biochemistry, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390-9050, USA

Received on September 12, 2000; revised on February 23, 2001; accepted on February 28, 2001

Motivation: Amino acid sequence alignments are widely used in the analysis of protein structure, function and evolutionary relationships. Proteins within a superfamily usually share the same fold and possess related functions. These structural and functional constraints are reflected in the alignment conservation patterns. Positions of functional and/or structural importance tend to be more conserved. Conserved positions are usually clustered in distinct motifs surrounded by sequence segments of low conservation. Poorly conserved regions might also arise from the imperfections in multiple alignment algorithms and thus indicate possible alignment errors. Quantification of conservation by attributing a conservation index to each aligned position makes motif detection more convenient. Mapping these conservation indices onto a protein spatial structure helps to visualize spatial conservation features of the molecule and to predict functionally and/or structurally important sites. Analysis of conservation indices could be a useful tool in detection of potentially misaligned regions and will aid in improvement of multiple alignments.

Results: We developed a program to calculate a conservation index at each position in a multiple sequence alignment using several methods. Amino acid frequencies at each position are estimated, and the conservation index is calculated from these frequencies. We utilize both unweighted frequencies and frequencies weighted using two different strategies. Three conceptually different approaches (entropy-based, variance-based and matrix score-based) are implemented in the algorithm to define the conservation index. Calculating conservation indices for 35522 positions in 284 alignments from the SMART database, we demonstrate that different methods result in highly correlated conservation indices (correlation coefficient greater than 0.85). Conservation indices show statistically significant correlation between sequentially adjacent positions, and averaging of the indices over a window of three positions is optimal for motif detection. Positions with gaps display substantially lower conservation properties. We compare conservation properties of the SMART alignments or FSSP structural alignments to those of the ClustalW alignments. The results suggest that conservation indices should be a valuable tool for alignment quality assessment and might be used as an objective function for refinement of multiple alignments.
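
To make the entropy-based approach and the three-position window averaging concrete, the following is a minimal sketch in Python. It is not the AL2CO code and does not reproduce its exact formulas: the function names, the gap symbol, the choice to ignore gaps when estimating frequencies, the use of unweighted frequencies only, and the sign convention (negative entropy, so that larger values mean higher conservation) are all assumptions made for illustration.

```python
import math
from collections import Counter

GAP = "-"  # assumed gap symbol; AL2CO's own gap handling may differ


def column_frequencies(column):
    """Unweighted residue frequencies in one alignment column, gaps ignored."""
    residues = [c for c in column if c != GAP]
    total = len(residues)
    if total == 0:
        return {}
    return {aa: n / total for aa, n in Counter(residues).items()}


def entropy_index(column):
    """Negative Shannon entropy (natural log) of a column.

    0.0 for an invariant column, more negative for more variable columns.
    The sign convention is an assumption of this sketch. An all-gap column
    is not treated specially and also scores 0.0.
    """
    freqs = column_frequencies(column)
    return sum(f * math.log(f) for f in freqs.values())


def conservation_profile(alignment):
    """Conservation index for every column of an equal-length alignment."""
    return [entropy_index(col) for col in zip(*alignment)]


def window_average(values, window=3):
    """Average each value with its neighbours; windows shrink at the ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


if __name__ == "__main__":
    # Hypothetical toy alignment: four sequences, three columns.
    toy = ["ACD", "ACE", "AGF", "A-G"]
    profile = conservation_profile(toy)
    print(profile)
    print(window_average(profile, window=3))
```

In this toy alignment the first column is invariant and receives the highest index, while the last column, with four different residues, receives the lowest. Sequence weighting and the variance-based and matrix score-based indices mentioned above are not covered by this sketch.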

Availability: The C code of the AL2CO program and its pre-compiled versions for several platforms as well as the details of the analysis are freely available at ftp://iole.swmed.edu/pub/al2co/.

Contact: grishin{at}chop.swmed.edu

* To whom correspondence should be addressed.




