Bioinformatics Advance Access originally published online on September 27, 2006
Bioinformatics 2006 22(22):2821-2822; doi:10.1093/bioinformatics/btl493

© The Author 2006. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

CAPS: coevolution analysis using protein sequences

Mario A. Fares* and David McNally

Evolutionary Genetics and Bioinformatics Laboratory, Department of Genetics, Smurfit Institute of Genetics, University of Dublin, Trinity College, Dublin 2, Dublin, Ireland

*To whom correspondence should be addressed.


    ABSTRACT

Summary: Coevolution Analysis using Protein Sequences (CAPS) is a Perl-based software package that identifies co-evolution between amino acid sites. Blosum-corrected amino acid distances are used to identify amino acid co-variation. The phylogenetic relationships among the sequences are used to remove the phylogenetic and stochastic dependencies between sites. The 3D protein structure is used to identify the nature of the dependencies between co-evolving amino acid sites. User-friendly, easily interpretable output files are generated.

Availability: CAPS version 1 is available at http://bioinf.gen.tcd.ie/~faresm/software/caps/. Distribution versions for Linux/Unix, Mac OS X and Windows operating systems are available, including manual and example files.

Contact: faresm{at}tcd.ie


    INTRODUCTION
 
Proteins are synthesised linearly in the cytosol of the cell and normally go through complex folding processes to acquire their final productive conformation. These folding processes make possible the 3D proximity of sites that are distinct in the sequence. Amino acid sites that are functionally or structurally linked to other regions of the protein will be subject to stronger selective constraints because of the dramatic effects that changes at these sites have on nearby regions of the protein. The evolution of amino acid sites is hence multi-factorial, depending on their intrinsic mutation rates and on the constraints imposed by their complex co-evolutionary networks (Fares, 2006). This brings into question the use of a single codon site as the unit of selection, as has been previously shown (Hughes and Nei, 1989; Marin et al., 2001).

Co-evolution between amino acid sites can be detected using non-parametric (e.g. Korber et al., 1993; Tillier et al., 2006) as well as parametric methods (e.g. Fares and Travers, 2006; Pollock et al., 1999). When a phylogenetic tree and a 3D protein structure are provided, distinguishing functional co-evolution from phylogenetic and stochastic co-variation becomes more approachable (Fares and Travers, 2006). CAPS provides a mathematically simple and computationally feasible way to uncover the co-evolutionary networks between amino acid sites within a protein (Fig. 1). Briefly, CAPS identifies co-evolving amino acid site pairs (e and k) by measuring the correlated evolutionary variation at these sites. Evolutionary variation, (θ_ek)_ij, is measured using time-corrected Blosum values for the transition between two amino acids at a particular site when comparing sequence i to sequence j at sites e and k. The transition between two amino acids at each site (sites e and k) is corrected by the divergence time of the sequences (taxa) i and j. The time is estimated as the mean number of substitutions per synonymous site between the two sequences being compared (Fares and Travers, 2006). Correlation of the mean variability is measured using the Pearson coefficient. Finally, the significance of the correlation coefficients is estimated by comparing the real correlation coefficients to the distribution of re-sampled correlation coefficients. Only co-evolving sites that are parsimony informative (presenting significant variability) are considered. Further, a step-down permutational procedure is applied to correct for multiple testing and non-independence of data (Westfall and Young, 1993).
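The correlation step can be illustrated with a minimal Python sketch. It is not the CAPS implementation: the function names (toy_score, site_variation, pearson) are hypothetical, the scoring function is a toy stand-in for a real Blosum matrix, and the time correction is assumed here to be a simple division of the score by the pairwise divergence time, which may differ from the correction CAPS actually applies.

```python
from itertools import combinations
from math import sqrt


def toy_score(a, b):
    """Hypothetical stand-in for a Blosum lookup (these are NOT real Blosum values)."""
    groups = ("AVLIM", "DE", "KR", "ST", "FWY")
    if a == b:
        return 4
    if any(a in g and b in g for g in groups):
        return 1
    return -2


def site_variation(alignment, times, site):
    """Time-corrected score for every sequence pair (i, j) at one alignment column.

    Assumption: the correction is score / divergence time for the pair.
    """
    pairs = combinations(range(len(alignment)), 2)
    return [toy_score(alignment[i][site], alignment[j][site]) / times[(i, j)]
            for i, j in pairs]


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


# Toy alignment of three sequences and two sites, with made-up divergence times.
alignment = ["AK", "AR", "VD"]
times = {(0, 1): 1.0, (0, 2): 1.0, (1, 2): 1.0}
r = pearson(site_variation(alignment, times, 0),
            site_variation(alignment, times, 1))
```

In this toy alignment the two columns change in step across all sequence pairs, so the coefficient r is at its maximum; real alignments would give a range of values whose significance still has to be assessed by re-sampling.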


Fig. 1 Flow of information through the software CAPS. Input files for CAPS include the protein-coding or amino acid multiple sequence alignment, the 3D protein structure as described in the PDB file and user-defined clades in the phylogeny for that protein. Clades are defined based on their biological relevance or the support values for the internal nodes (e.g. bootstrap support values >75%). CAPS identifies significant co-evolving pairs of amino acid sites by comparing the correlation coefficients generated from the analysis of the sequence alignment to a distribution of correlation coefficients from pseudo-randomly sampled pairs of sites. The step-down permutational procedure is applied to account for multiple tests and non-independent data.
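The comparison of an observed coefficient with a re-sampled distribution can be sketched as follows. This is a hedged illustration only: resampled_p_value and its parameters are hypothetical names, the pairwise coefficients are assumed to have been computed already, and the step-down correction for multiple testing is deliberately left out.

```python
import random


def resampled_p_value(observed_r, pair_correlations, n_samples=1000, seed=0):
    """Empirical p-value of |observed_r| against randomly drawn site-pair coefficients.

    pair_correlations maps (site_e, site_k) -> correlation coefficient.
    The +1 terms keep the estimate away from exactly zero.
    """
    rng = random.Random(seed)          # fixed seed so the sketch is reproducible
    null_values = list(pair_correlations.values())
    hits = 0
    for _ in range(n_samples):
        sampled_r = rng.choice(null_values)   # pseudo-randomly sampled pair of sites
        if abs(sampled_r) >= abs(observed_r):
            hits += 1
    return (hits + 1) / (n_samples + 1)


# Made-up coefficients for three site pairs.
example = {(1, 2): 0.1, (1, 3): -0.2, (2, 3): 0.9}
p_strong = resampled_p_value(0.95, example, n_samples=99)
p_weak = resampled_p_value(0.05, example, n_samples=99)
```

A coefficient larger than anything in the re-sampled distribution receives the smallest attainable p-value for the chosen number of samples, whereas a coefficient smaller than every sampled value receives a p-value of 1.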

 
A sub-program named CladesCAPS.pl removes the phylogenetic co-evolution by running CAPS after eliminating the user-specified phylogenetic clades. Finally, output files including the final set of functionally/structurally important sites are generated. When the crystal structure of the protein is available, CAPS also tests the significance of the distance between the amino acid sites identified as co-evolving, providing useful information about the type of co-evolution (e.g. functional or structural co-evolution).
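One simple way to ask whether two co-evolving sites are unusually close in the structure is sketched below. The text does not specify the test CAPS uses, so this is an assumption-laden illustration: proximity_p_value is a hypothetical name, the coordinates are made up (imagined as C-alpha positions in ångströms), and the reference distribution is simply all residue pairs in the toy structure.

```python
from itertools import combinations
from math import dist


def proximity_p_value(coords, pair):
    """Return the distance for `pair` and the fraction of all residue pairs
    that are at least as close (an exhaustive empirical p-value)."""
    observed = dist(coords[pair[0]], coords[pair[1]])
    all_distances = [dist(coords[a], coords[b])
                     for a, b in combinations(coords, 2)]
    at_least_as_close = sum(d <= observed for d in all_distances)
    return observed, at_least_as_close / len(all_distances)


# Hypothetical residue numbers mapped to made-up 3D coordinates.
coords = {
    10: (0.0, 0.0, 0.0),
    25: (3.0, 4.0, 0.0),
    40: (30.0, 0.0, 0.0),
    77: (0.0, 40.0, 0.0),
}
distance, p_value = proximity_p_value(coords, (10, 25))
```

A small fraction suggests that the pair lies closer together than most residue pairs, which would be compatible with a structural rather than a long-range functional dependency.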

In addition to implementing the previously published method (Fares and Travers, 2006), CAPS performs a preliminary analysis of compensatory mutations by testing for correlation in the hydrophobicity as well as in the molecular weight variations between co-evolving amino acids. Inter-protein co-evolution analysis is also an option in CAPS, in addition to the intra-molecular co-evolution analysis developed previously (Fares and Travers, 2006).
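The hydrophobicity test can be pictured with the sketch below. It rests on assumptions not stated in the text: the Kyte-Doolittle hydropathy scale is used (the scale CAPS uses is not specified here), variation is taken as the signed difference between the two sequences of each pair, and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt

# Kyte-Doolittle hydropathy index (assumed scale for this illustration).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_changes(alignment, site):
    """Signed hydropathy difference at one site for every sequence pair (i, j)."""
    return [HYDROPATHY[alignment[i][site]] - HYDROPATHY[alignment[j][site]]
            for i, j in combinations(range(len(alignment)), 2)]


def pearson_r(x, y):
    """Pearson correlation coefficient; returns 0.0 if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def hydropathy_correlation(alignment, site_e, site_k):
    """Correlation of hydropathy changes between two alignment columns."""
    return pearson_r(hydropathy_changes(alignment, site_e),
                     hydropathy_changes(alignment, site_k))


# Toy alignment: three sequences, two sites.
toy_alignment = ["AD", "IK", "LE"]
r_hydro = hydropathy_correlation(toy_alignment, 0, 1)
```

Under these assumptions, a strongly negative coefficient would mean that a gain in hydrophobicity at one site tends to be accompanied by a loss at the other, which is one pattern a compensatory change could produce. The same structure would apply to molecular weights by swapping the lookup table.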

The emphasis in CAPS has been centred on four main points: sensitivity of the co-evolutionary analyses, automatic performance, accessibility and the ability to handle highly populated multiple sequence alignments. A protein-coding or amino acid multiple sequence alignment is required in one of the standard formats used in other programs (PHYLIP, MEGA or FASTA). The program generates an output file that summarises the results of the co-evolution analysis, including a table with all the parameters estimated. Several Excel-readable files are also generated for easier interpretation of the results. For each co-evolving pair of sites, the site location in the reference sequence for which the 3D structure is available is provided together with the site location in the alignment. Correlations of hydrophobicities and molecular weights for the pairs of co-evolving sites are also provided.

The performance of the algorithm, together with the sensitivity of the method, has been examined in several proteins (Fares and Travers, 2006). Although there is no limit on the length of the sequences, long and well-populated multiple sequence alignments (e.g. multiple sequence alignments containing 20 sequences or more) provide very accurate results. A limitation of CAPS is that the method does not account for recombination. We will upgrade CAPS in future versions to include other analyses, such as a more exhaustive identification of compensatory mutations (conditionally advantageous mutations) and the prediction of protein–protein interfaces.


    Acknowledgments
 
We would like to thank beta testers of CAPS for identifying bugs. We are especially thankful to Dr David Posada for helpful comments and algorithm suggestions for CAPS. This work was supported by Science Foundation Ireland.

Conflict of Interest: none declared


    FOOTNOTES
 
Associate Editor: Alfonso Valencia

Received on August 11, 2006; revised on September 12, 2006; accepted on September 19, 2006

    REFERENCES

    Fares, M.A. (2006) Computational and statistical methods to detect the various dimensions of protein evolution. Curr. Bioinform., 1, 207–217.

    Fares, M.A. and Travers, S.A. (2006) A novel method for detecting intramolecular coevolution: adding a further dimension to selective constraints analyses. Genetics, 173, 9–23.

    Hughes, A.L. and Nei, M. (1989) Nucleotide substitution at major histocompatibility complex class II loci: evidence for overdominant selection. Proc. Natl Acad. Sci. USA, 86, 958–962.

    Korber, B.T., et al. (1993) Covariation of mutations in the V3 loop of human immunodeficiency virus type 1 envelope protein: an information theoretic analysis. Proc. Natl Acad. Sci. USA, 90, 7176–7180.

    Marin, I., et al. (2001) Detecting changes in the functional constraints of paralogous genes. J. Mol. Evol., 52, 17–28.

    Pollock, D.D., et al. (1999) Coevolving protein residues: maximum likelihood identification and relationship to structure. J. Mol. Biol., 287, 187–198.

    Tillier, E.R., et al. (2006) Codep: maximizing co-evolutionary interdependencies to discover interacting proteins. Proteins, 63, 822–831.

    Westfall, P.H. and Young, S.S. (1993) Resampling-Based Multiple Testing. New York: John Wiley & Sons.

