Bioinformatics Advance Access originally published online on June 30, 2005
Bioinformatics 2005 21(17):3482-3489; doi:10.1093/bioinformatics/bti564
The inference of protein-protein interactions by co-evolutionary analysis is improved by excluding the information about the phylogenetic relationships
1Bioinformatics Center, Institute for Chemical Research, Kyoto University Gokasho, Uji, Kyoto 611-0011, Japan
2Centre de Géostatistique Ecole des Mines de Paris, 35 rue Saint-Honoré, 77305 Fontainebleau cedex, France
3Division of Bioinformatics, Medical Institute of Bioregulation, Kyushu University Fukuoka, Fukuoka 812-8582, Japan
*To whom correspondence should be addressed.
| Abstract |
|---|
Motivation: The prediction of protein-protein interactions is currently an important issue in bioinformatics. The mirror tree method uses evolutionary information to predict protein-protein interactions. However, it has been recognized that predictions by the mirror tree method lead to many false positives. The aim of our study was to solve this problem by improving the method of extracting the co-evolutionary information regarding the protein pairs.
Results: We developed a novel method to predict protein-protein interactions from co-evolutionary information in the framework of the mirror tree method. The originality lies in the use of a projection operator to exclude the information about the phylogenetic relationships among the source organisms from the distance matrix. Each distance matrix was transformed into a vector for the operation. The vector is referred to as a phylogenetic vector. We propose three ways to extract the phylogenetic information: (1) using the 16S rRNA from the same source organisms as the proteins under consideration, (2) averaging the phylogenetic vectors and (3) analyzing the principal components of the phylogenetic vectors. We examined the performance of the proposed methods in predicting interacting protein pairs from Escherichia coli, using experimentally verified data. Our method was successful, and it drastically reduced the number of false positives in the prediction.
Availability: The R script for the prediction of protein-protein interactions reported in this manuscript is available at http://timpani.genome.ad.jp/~proj/
Contact: sato{at}kuicr.kyoto-u.ac.jp
Supplementary information: The information is also available at the same site as the R script.
| 1 INTRODUCTION |
|---|
Information about protein-protein interactions in living cells provides deep insight into the biological functions of proteins and the behavior of cells. Genome-wide experimental analyses, such as the yeast 2-hybrid system (Ito et al., 2001; Uetz et al., 2000) and mass spectrometry (Gavin et al., 2002; Ho et al., 2002), have facilitated exhaustive investigations of protein-protein interactions in cells. However, such experimental methods have coverage and accuracy problems (Sprinzak et al., 2003; von Mering et al., 2002). The prediction of protein-protein interactions has therefore become one of the major issues in bioinformatics. Predicted protein-protein interactions can provide complementary or supporting evidence to the genome-wide experimental studies, even though computational analyses suffer from the same problems as experimental studies, namely low coverage and low accuracy.
Various methods to predict protein-protein interactions have been developed. One approach is prediction through genome comparisons, which includes phylogenetic profile (Pellegrini et al., 1999), Rosetta stone (Enright et al., 1999) and conserved gene neighborhood analyses (Dandekar et al., 1998). Prediction by using information about the co-occurrence of domains in protein-protein interactions is another approach. Co-evolutionary behavior between interacting proteins is also useful information for predictions. There are two representative prediction methods that utilize co-evolutionary information, the mirror tree method (Pazos and Valencia, 2001) and the in silico 2-hybrid system method (Pazos and Valencia, 2002). In this paper, we focus on the mirror tree method.
Although there are several preceding works, such as Goh et al. (2000), the mirror tree method was developed by Pazos and Valencia (2001). The mirror tree method predicts protein-protein interactions under the assumption that interacting proteins show similarity in their molecular phylogenetic trees because of co-evolution through the interaction. However, it is difficult to evaluate the similarity directly between a pair of molecular phylogenetic trees. Instead, the mirror tree method compares a pair of distance matrices in order to evaluate the extent of co-evolutionary behavior between two proteins. We will explain the method briefly. Consider two proteins, say, proteins A and B. The orthologues of protein A are collected from n species. The n sequences of protein A are aligned and the distance matrix, DA, is calculated. The size of DA is n x n, and each row or column of the matrix corresponds to a species under consideration. An element of the matrix, DA(i, j), represents the genetic distance between species i and j, which is calculated by comparing the amino acid sequences of protein A between the two species. A distance matrix is symmetric, so the upper or lower half of the matrix alone contains sufficient information for the prediction. Likewise, the orthologues of protein B are collected from the same n species, and the distance matrix, DB, is calculated. The intensity of the co-evolutionary constraint between proteins A and B is evaluated as Pearson's correlation coefficient, $\rho_{AB}$, between the distance matrices DA and DB, which is calculated as follows:

$$\rho_{AB} = \frac{\sum_{i<j}\left(D_A(i,j) - \bar{D}_A\right)\left(D_B(i,j) - \bar{D}_B\right)}{\sqrt{\sum_{i<j}\left(D_A(i,j) - \bar{D}_A\right)^2}\,\sqrt{\sum_{i<j}\left(D_B(i,j) - \bar{D}_B\right)^2}} \qquad (1)$$

where $\bar{D}_A$ and $\bar{D}_B$ are the averages of the elements in the upper halves of DA and DB.
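As an illustration of the comparison described above, the following is a minimal Python sketch of the mirror tree score, assuming each distance matrix is given as a plain list of lists. The function names and the toy matrices are hypothetical and are not taken from the authors' R script.

```python
import math

def upper_triangle(matrix):
    """Return the elements above the diagonal, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def mirror_tree_score(d_a, d_b):
    """Correlation between the upper halves of two distance matrices."""
    return pearson(upper_triangle(d_a), upper_triangle(d_b))

# Toy matrices for three species (made-up values); d_b is d_a scaled by 2,
# so the two proteins have perfectly correlated distances.
d_a = [[0.0, 0.1, 0.4],
       [0.1, 0.0, 0.5],
       [0.4, 0.5, 0.0]]
d_b = [[0.0, 0.2, 0.8],
       [0.2, 0.0, 1.0],
       [0.8, 1.0, 0.0]]
score = mirror_tree_score(d_a, d_b)
```

Because the score depends only on the relative pattern of distances, rescaling one matrix does not change it.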
One of the problems of the mirror tree method is the large number of false positives in the prediction. Even protein pairs that are known not to interact often show high correlation coefficients. The abundance of false positives in the mirror tree prediction reduces the reliability of the method in actual applications. The distance matrices of orthologous proteins from the same set of n source organisms are compared in the mirror tree method. Therefore, all of the distance matrices of the proteins are considered to include the information about the phylogenetic relationships among the same n sources, to some extent. The phylogenetic relationships among the identical set of sources behind the distance matrices would be the cause for such a high correlation between non-interacting proteins. If we can exclude the information about the phylogenetic relationships from the distance matrices then the performance of the mirror tree method may be improved.
In our method, we used a projection operator to exclude the information about the phylogenetic relationships of the sources, and then the residual information after this operation was used for the calculation of the correlation coefficient between proteins. The projection operator is a linear transformation in a vector space. A point in the vector space is projected to a subspace so that the difference vector between the original point and the image in the subspace is orthogonal to the subspace. The projection operator is widely used in various fields, such as multivariate analysis and quantum mechanics. One of the well-known examples of the use of the projection operator is spectral resolution. We applied our method to physically contacting proteins, to evaluate its performance. That is, in this manuscript a protein-protein interaction means physical contact. As discussed below, our method succeeded in drastically reducing the number of false positives in the predicted protein-protein interactions. The quality of the data needed to realize a correct prediction was also examined. We also found that the inclusion of distantly related orthologues in the data improves the performance. The benefits and limitations of our approach are discussed based on our observations.
| 2 METHODS |
|---|
Our method is outlined in Figure 1.
2.1 Data preparation
We selected 13 pairs of Escherichia coli proteins that are physically in contact, from the Database of Interacting Proteins (DIP) Version 01/02/2005 (Salwinski et al., 2004). The selected pairs are described in the legend for Table 1. Each pair was selected so that neither of the interacting proteins participated in the remaining 12 pairs of interacting proteins. Then, putative orthologues corresponding to the 26 proteins derived from E. coli were collected from 40 other bacterial species, according to the description in the KEGG/KO database (Kanehisa et al., 2004). The sources are shown in the Supplemental Figure S1. Hereafter, the set of putative orthologues from the 41 bacterial sources (E. coli and the 40 other species) is simply referred to as the orthologues. One of the important assumptions in this study is that a pair of proteins that are orthologous to interacting proteins of E. coli are also physically in contact. The other assumption is that the interaction affects the co-evolution of the orthologues.
A multiple alignment of each set of orthologous proteins was made with the alignment software MAFFT (Katoh et al., 2005). A distance matrix for the orthologues was calculated from the multiple alignment. Then, a genetic distance between every pair of aligned sequences was calculated as a maximum likelihood estimate using the PROTDIST module in the PHYLIP package (Felsenstein, 2004). The score table by Jones et al. (1992) was used for the maximum likelihood estimation. A distance matrix for a set of orthologues was constructed with the genetic distances.
2.2 Transformation from distance matrix to phylogenetic vector
The distance matrix was transformed into a vector for easier formulation. The upper or lower half of the non-diagonal elements of the distance matrix was arranged as an array of numerical values in a fixed order. All of the matrices were transformed into vectors with the same order of elements. When the matrix has a size of n x n, the dimension of the vector is N = n(n - 1)/2. The vector is hereafter referred to as a phylogenetic vector. In this study, n is equal to 41. Therefore, the dimension of the phylogenetic vector is 820. Let us consider a pair of phylogenetic vectors $|v_i\rangle$ and $|v_j\rangle$, which are transformed from distance matrices Di and Dj, where the subscripts i and j indicate different sets of orthologues. Then, we apply the normalization of the elements of each vector with the average and the standard deviation of the elements as follows:

$$|\hat{v}_i\rangle = \frac{|v_i\rangle - |\mu_i\rangle}{\sqrt{N\,\mathrm{Var}(v_i)}}$$

where $|\mu_i\rangle$ is a vector with the same dimension as $|v_i\rangle$. All the elements of $|\mu_i\rangle$ are constant, and are equivalent to the arithmetic average over the elements of $|v_i\rangle$. $\mathrm{Var}(v_i)$ indicates the variance over all the elements of $|v_i\rangle$. The hat in $|\hat{v}_i\rangle$ indicates that the vector is normalized. Then, the inner product of a pair of normalized vectors is reduced to the Pearson's correlation coefficient used for the mirror tree method, which is defined by formula (1). Hereafter, the correlation coefficient will be denoted as $\rho^{\mathrm{MIRROR}}_{ij}$:

$$\rho^{\mathrm{MIRROR}}_{ij} = \langle\hat{v}_i|\hat{v}_j\rangle$$
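The transformation and normalization described in this subsection can be sketched in Python as follows. This is only an illustration under the assumption that "normalized" means centered and scaled to unit length, which is what makes the inner product equal Pearson's correlation coefficient; the function names and the toy matrix are hypothetical.

```python
import math

def to_phylogenetic_vector(matrix):
    """Flatten the upper half of a symmetric distance matrix in a fixed order."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def normalize(vec):
    """Subtract the mean and scale to unit length (assumed normalization)."""
    mean = sum(vec) / len(vec)
    centered = [x - mean for x in vec]
    norm = math.sqrt(sum(c * c for c in centered))
    return [c / norm for c in centered]

def inner(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

# Dimension of the phylogenetic vector for the 41 sources used in the text.
n = 41
dimension = n * (n - 1) // 2

# Toy 4 x 4 distance matrix (made-up values).
d_i = [[0, 1, 2, 4],
       [1, 0, 3, 5],
       [2, 3, 0, 6],
       [4, 5, 6, 0]]
v_i = to_phylogenetic_vector(d_i)
v_hat = normalize(v_i)
```

With this normalization, `inner(normalize(a), normalize(b))` gives the same value as Pearson's correlation coefficient between `a` and `b`.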
2.3 Projection operator
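Consider a unit vector $|u\rangle$, which represents the phylogenetic relationship of the species under consideration. If such a vector is obtained, then the following projection operator P can be defined as

$$P = I - |u\rangle\langle u| \qquad (2)$$

where $|u\rangle\langle u|$ is also a projection operator, onto the direction of the unit vector $|u\rangle$. The projection operator is a matrix with the size of n(n - 1)/2 x n(n - 1)/2. The method to obtain $|u\rangle$ is explained below. I represents an identity matrix with the same size as $|u\rangle\langle u|$. By applying the projection operator (2) to a phylogenetic vector, say, $|v_i\rangle$, the component within $|v_i\rangle$ that is orthogonal to $|u\rangle$ is obtained as follows:

$$|\varepsilon_i\rangle = P|v_i\rangle = |v_i\rangle - |u\rangle\langle u|v_i\rangle \qquad (3)$$

The residual vectors for proteins i and j are normalized in the same manner as in Section 2.2, which gives $|\hat{\varepsilon}_i\rangle$ and $|\hat{\varepsilon}_j\rangle$. Then, the inner product of the two vectors

$$\rho_{ij} = \langle\hat{\varepsilon}_i|\hat{\varepsilon}_j\rangle$$

is a new measure to evaluate the co-evolutionary behavior between proteins i and j.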
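A minimal Python sketch of the projection step follows. It assumes the unit vector is already known and that the final score is Pearson's correlation coefficient between the two residual vectors. The vectors below are made-up values in which two proteins share only a common background, and all names are hypothetical.

```python
import math

def inner(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def unit(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(inner(vec, vec))
    return [x / norm for x in vec]

def project_out(vec, u):
    """Apply P = I - |u><u|: remove the component of vec along unit vector u."""
    coefficient = inner(u, vec)
    return [x - coefficient * ui for x, ui in zip(vec, u)]

def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cx = [a - mean_x for a in x]
    cy = [b - mean_y for b in y]
    return inner(cx, cy) / math.sqrt(inner(cx, cx) * inner(cy, cy))

# Shared "phylogenetic" background plus small protein-specific deviations.
background = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
v_i = [1.1, 2.0, 2.9, 4.1, 5.0, 5.9]
v_j = [0.9, 2.1, 3.0, 3.9, 5.1, 6.0]
u = unit(background)

raw = pearson(v_i, v_j)                                      # dominated by the background
projected = pearson(project_out(v_i, u), project_out(v_j, u))  # background removed
```

In this toy case the raw correlation is close to 1 only because both vectors follow the same background, and it drops once that component is projected out.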
2.4 Unit vector in the projection operator
The remaining problem is how to obtain the unit vector $|u\rangle$ representing the phylogenetic relationship of the source organisms. We developed three different methods to design such a unit vector: (1) transformation of the distance matrix of 16S ribosomal RNA (rRNA) from the same source organisms as the proteins under consideration, (2) averaging the phylogenetic vectors and (3) analyzing the principal components of the phylogenetic vectors.
In the first method, 16S rRNA was used for the calculation. Basically, each organism has at least one copy of the 16S rRNA gene. Therefore, the distance matrix or the phylogenetic vector of the 16S rRNAs is considered to represent the phylogenetic relationship among the source organisms. The rRNA sequences from the same sources as the proteins under consideration were collected from the KEGG/GENES database (Kanehisa et al., 2004) and the Ribosomal Database Project-II Release 9 (Gustafson et al., 2005). The rRNA sequences thus collected were aligned, and the distance between every pair of the aligned RNA sequences was calculated by using the F84 scoring table (Kishino and Hasegawa, 1989) and the DNADIST module in the PHYLIP package (Felsenstein, 2004). The distance matrix was then transformed into a phylogenetic vector $|v_{\mathrm{16S}}\rangle$. Here $\|v_{\mathrm{16S}}\|$ indicates the size of the vector. Then, a unit vector $|u_{\mathrm{16S}}\rangle$ was obtained as $|v_{\mathrm{16S}}\rangle / \|v_{\mathrm{16S}}\|$.
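In the second method, all of the phylogenetic vectors under consideration were first normalized so that the standard deviation of the elements of each vector was 1. Then, they were averaged as

$$|v_{\mathrm{AVE}}\rangle = \frac{1}{m}\sum_{i=1}^{m}\frac{|v_i\rangle}{\sqrt{\mathrm{Var}(v_i)}}$$

where m is the number of phylogenetic vectors. The second unit vector, $|u_{\mathrm{AVE}}\rangle$, was obtained as $|v_{\mathrm{AVE}}\rangle / \|v_{\mathrm{AVE}}\|$.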
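The averaging method can be sketched in Python as below. The sketch assumes that the standard deviation is the population standard deviation over the elements of each vector; the function names and the input vectors are hypothetical.

```python
import math

def scale_to_unit_sd(vec):
    """Divide a vector by the standard deviation of its elements."""
    mean = sum(vec) / len(vec)
    variance = sum((x - mean) ** 2 for x in vec) / len(vec)
    sd = math.sqrt(variance)
    return [x / sd for x in vec]

def average_unit_vector(vectors):
    """Average the scaled phylogenetic vectors and return a unit vector."""
    scaled = [scale_to_unit_sd(v) for v in vectors]
    m = len(scaled)
    average = [sum(column) / m for column in zip(*scaled)]
    norm = math.sqrt(sum(a * a for a in average))
    return [a / norm for a in average]

# Three toy vectors that differ only in scale (made-up values).
vectors = [[1.0, 2.0, 3.0],
           [2.0, 4.0, 6.0],
           [10.0, 20.0, 30.0]]
u_ave = average_unit_vector(vectors)
```

Scaling each vector first keeps proteins with large overall distances from dominating the average.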
In the third method, the phylogenetic vectors were used again. Let X be a matrix in which the i-th column corresponds to a phylogenetic vector of protein i, normalized with the average and the standard deviation. The size of X is n(n - 1)/2 x m. Then, a correlation coefficient matrix Y was calculated as $X^{T}X$. The superscript T indicates the transpose of a matrix. Therefore, the size of Y is m x m. The principal component analysis for the data corresponding to X was carried out by solving the eigenvalue problem of Y. Then, $|v_{\mathrm{PC1}}\rangle$ was obtained as $|v_{\mathrm{PC1}}\rangle = X|z_1\rangle$, where $|z_1\rangle$ is the first principal component axis associated with the largest eigenvalue of the correlation coefficient matrix. $|v_{\mathrm{PC1}}\rangle$ thus obtained is expected to represent the most common features of the m phylogenetic vectors. Then, $|v_{\mathrm{PC1}}\rangle / \|v_{\mathrm{PC1}}\|$ generated the third unit vector, $|u_{\mathrm{PC1}}\rangle$.
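The principal component method can be sketched in Python without external libraries. The sketch below uses power iteration to approximate the leading eigenvector of the m x m correlation matrix, which is a swapped-in technique chosen for brevity and not necessarily how the authors solved the eigenvalue problem. It assumes the starting guess is not orthogonal to that eigenvector, and all names are hypothetical.

```python
import math

def normalize(vec):
    """Subtract the mean and scale to unit length."""
    mean = sum(vec) / len(vec)
    centered = [x - mean for x in vec]
    norm = math.sqrt(sum(c * c for c in centered))
    return [c / norm for c in centered]

def first_pc_unit_vector(vectors, iterations=200):
    """Unit vector along X|z1>, with |z1> the leading eigenvector of X^T X."""
    columns = [normalize(v) for v in vectors]  # columns of X
    m = len(columns)
    # Correlation coefficient matrix Y = X^T X (m x m).
    y = [[sum(a * b for a, b in zip(columns[i], columns[j]))
          for j in range(m)] for i in range(m)]
    z = [1.0] * m  # starting guess for the power iteration
    for _ in range(iterations):
        z_new = [sum(y[i][j] * z[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(c * c for c in z_new))
        z = [c / norm for c in z_new]
    length = len(columns[0])
    pc1 = [sum(z[i] * columns[i][k] for i in range(m)) for k in range(length)]
    norm = math.sqrt(sum(p * p for p in pc1))
    return [p / norm for p in pc1]

# Three toy phylogenetic vectors with a strong common trend (made-up values).
vectors = [[1.0, 2.0, 3.0, 4.0],
           [2.0, 4.0, 6.0, 8.5],
           [1.0, 2.1, 2.9, 4.0]]
u_pc1 = first_pc_unit_vector(vectors)
```

Working with the m x m matrix instead of the n(n - 1)/2 x n(n - 1)/2 one keeps the eigenvalue problem small when there are few proteins.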
In the second and third methods it is assumed that the information, except for the phylogenetic relationship of the sources, can be approximately canceled out by the average operation or principal component analysis. The first method requires the presence of 16S rRNA from the same sources as the proteins under consideration, whereas the latter two methods are feasible with only the phylogenetic vectors. The Pearson's correlation coefficients between the residuals for two sets of orthologues i and j, which were projected out by the operators constructed with $|u_{\mathrm{16S}}\rangle$, $|u_{\mathrm{AVE}}\rangle$ and $|u_{\mathrm{PC1}}\rangle$, were represented by $\rho^{\mathrm{16S}}_{ij}$, $\rho^{\mathrm{AVE}}_{ij}$ and $\rho^{\mathrm{PC1}}_{ij}$. When the subscripts, i and j, are omitted, $\rho^{*}$ collectively represents the type of correlation coefficient indicated by the superscript.
| 3 RESULTS AND DISCUSSION |
|---|
3.1 Prediction of protein-protein interactions by using $\rho^{\mathrm{MIRROR}}$, $\rho^{\mathrm{16S}}$, $\rho^{\mathrm{AVE}}$ and $\rho^{\mathrm{PC1}}$
We calculated four types of correlation coefficients, $\rho^{\mathrm{MIRROR}}$, $\rho^{\mathrm{16S}}$, $\rho^{\mathrm{AVE}}$ and $\rho^{\mathrm{PC1}}$, for all of the possible pairs of 26 proteins, that is, 325 pairs of proteins. The performance of each correlation coefficient was evaluated with the number of false positives. The correlation coefficients, sorted in decreasing order, are listed in the Supplemental Table S1, and only the top 30 members of the lists are shown in Table 1. Out of the 325 pairs, the interactions of 13 pairs have been experimentally identified and are highlighted with asterisks in the table. The top ranks of $\rho^{\mathrm{16S}}$, $\rho^{\mathrm{AVE}}$ and $\rho^{\mathrm{PC1}}$ were occupied by pairs of actually interacting proteins. In contrast, non-interacting proteins were present within the top ranks of $\rho^{\mathrm{MIRROR}}$. The decreasing patterns of the four correlation coefficients are shown in Figure 2, which shows that $\rho^{\mathrm{MIRROR}}$ decreased slowly, whereas $\rho^{\mathrm{AVE}}$ and $\rho^{\mathrm{PC1}}$ decreased rapidly. The rate of the $\rho^{\mathrm{16S}}$ decrease was rather moderate. Both Table 1 and Figure 2 clearly demonstrate the problem of the original mirror tree method. Even if a high value, say 0.9, is used as a threshold for the correlation coefficient to predict a protein-protein interaction, $\rho^{\mathrm{MIRROR}}$ produces many pairs with high correlation, including non-interacting partners, and is likely to lead to many false positives in the prediction. In contrast, the occupation of the top ranks by interacting proteins and the rapid decreases of $\rho^{\mathrm{16S}}$, $\rho^{\mathrm{AVE}}$ and $\rho^{\mathrm{PC1}}$ indicate that prediction with these three correlation coefficients is accurate if the threshold is set at a sufficiently high value.
The unit vector $|u\rangle$ seems to be a crucial factor for the prediction of a protein-protein interaction in the methods with a projection operator. Therefore, we examined the association among $|u_{\mathrm{16S}}\rangle$, $|u_{\mathrm{AVE}}\rangle$ and $|u_{\mathrm{PC1}}\rangle$ by calculating Pearson's correlation coefficient, which is denoted as r below. We considered the absolute value of r because the sign of r is not meaningful in this context. |r| between $|u_{\mathrm{16S}}\rangle$ and $|u_{\mathrm{AVE}}\rangle$ was 0.94697, whereas |r| between $|u_{\mathrm{16S}}\rangle$ and $|u_{\mathrm{PC1}}\rangle$ was 0.94597. The highest correlation, |r| = 0.99805, was observed between $|u_{\mathrm{AVE}}\rangle$ and $|u_{\mathrm{PC1}}\rangle$. The high correlation between $|u_{\mathrm{16S}}\rangle$ and the other unit vectors suggests that one of our assumptions described above is correct: the information except for the phylogenetic relationship of sources can be approximately canceled out by the average operation or principal component analysis. The similarity in the patterns of the decreases in the correlation coefficients roughly corresponded to the similarity in the unit vectors. As shown in Figure 2, the two sets of plots of $\rho^{\mathrm{AVE}}$ and $\rho^{\mathrm{PC1}}$, which were calculated with $|u_{\mathrm{AVE}}\rangle$ and $|u_{\mathrm{PC1}}\rangle$, overlapped each other. On the other hand, the plots of $\rho^{\mathrm{16S}}$, which was related to $|u_{\mathrm{16S}}\rangle$, slightly deviated from the plots of $\rho^{\mathrm{AVE}}$ and $\rho^{\mathrm{PC1}}$.
The $\rho^{\mathrm{16S}}$, $\rho^{\mathrm{AVE}}$ and $\rho^{\mathrm{PC1}}$ analyses seem to outperform the $\rho^{\mathrm{MIRROR}}$ analysis to a large extent. That is, excluding the information about the phylogenetic relationships among the source organisms from the distance matrices is effective in removing false positives from the prediction by the mirror tree method. To investigate how different threshold values affect the accuracy of the prediction, we introduced four thresholds for correlation coefficients, 0.9, 0.8, 0.7 and 0.6 (Table 2). The performances of the original mirror tree method and our proposed methods were evaluated with regard to sensitivity and specificity. When a pair of proteins had a correlation coefficient greater than the threshold, the proteins were predicted to interact with each other. The advantage of $\rho^{\mathrm{AVE}}$ and $\rho^{\mathrm{PC1}}$ was the high specificity for any threshold. $\rho^{\mathrm{16S}}$ showed high specificity only for thresholds 0.9 and 0.8. In contrast, $\rho^{\mathrm{MIRROR}}$ showed high sensitivity in all of the cases, except for the threshold = 0.9. The high specificities of $\rho^{\mathrm{16S}}$, $\rho^{\mathrm{AVE}}$ and $\rho^{\mathrm{PC1}}$ mean a drastic reduction of false positives, as compared with $\rho^{\mathrm{MIRROR}}$. We will demonstrate how the number of false positives was reduced by our methods using a concrete example. Take proteins RpoB and SecY, which do not interact with each other. Nevertheless, the $\rho^{\mathrm{MIRROR}}$ value of the pair was 0.95463, which occupies the 8th position of the list in Table 1. The same pair is presented at the 15th position in the sorted list of $\rho^{\mathrm{16S}}$. As for $\rho^{\mathrm{AVE}}$ and $\rho^{\mathrm{PC1}}$, the corresponding coefficients between the pair were 0.49806 and 0.38340, which are present at the 15th and 27th positions of the lists in Table 1.
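The threshold-based evaluation described above can be illustrated with a short Python sketch. The protein labels and coefficient values below are made up for illustration and are not data from Table 1 or Table 2; the function name is hypothetical.

```python
def sensitivity_specificity(scores, interacting, threshold):
    """Return (sensitivity, specificity) when pairs above threshold are predicted.

    scores: dict mapping a protein pair to its correlation coefficient.
    interacting: set of pairs known to interact.
    """
    tp = fp = tn = fn = 0
    for pair, value in scores.items():
        predicted = value > threshold
        actual = pair in interacting
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif not predicted and actual:
            fn += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)

# Number of unordered pairs among the 26 proteins used in the text.
n_pairs = 26 * 25 // 2

# Hypothetical scores for four proteins A-D, with A-B and C-D interacting.
scores = {("A", "B"): 0.95, ("C", "D"): 0.65, ("A", "C"): 0.85,
          ("A", "D"): 0.30, ("B", "C"): 0.10, ("B", "D"): 0.20}
interacting = {("A", "B"), ("C", "D")}

at_08 = sensitivity_specificity(scores, interacting, 0.8)
at_06 = sensitivity_specificity(scores, interacting, 0.6)
```

Lowering the threshold can only keep or raise sensitivity, and can only keep or lower specificity, which is the trade-off examined with the four thresholds.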
Despite the improvement described above, the sensitivities of $\rho^{\mathrm{16S}}$, $\rho^{\mathrm{AVE}}$ and $\rho^{\mathrm{PC1}}$ were lower than that of $\rho^{\mathrm{MIRROR}}$. This means that a pair of proteins i and j, which interact with each other, will not always show high $\rho^{\mathrm{16S}}_{ij}$, $\rho^{\mathrm{AVE}}_{ij}$ or $\rho^{\mathrm{PC1}}_{ij}$ coefficients. In other words, the number of false negatives increased when our methods were used, as compared with the original mirror tree method. In this study, we calculated the intensity of co-evolution between a pair of proteins as the correlation coefficient after the projection operation. However, the proteins of a pair may also interact with other proteins. If such proteins exist, the interaction between the pair would be difficult to detect, because the co-evolution with the other partners would interfere with the detection. To examine this hypothesis, we investigated the relationship between the multiplicity of the interaction and the correlation coefficient. The correlation coefficients, the multiplicities of interacting partners and the ranks in the sorted lists of the 13 pairs of interacting proteins are shown in Table 3. The multiplicity of interacting partners for proteins was evaluated with a modified Jaccard coefficient. The interacting partners were searched from the DIP database (Salwinski et al., 2004). Consider an interacting pair of proteins A and B. Let M and N be the sets of interacting partners of proteins A and B. Therefore, protein B belongs to M, whereas N includes protein A. The Jaccard coefficient is defined as $|M \cap N| / |M \cup N|$, where |M| is the size of the set M, that is, the number of elements in the set. When proteins A and B share many interacting partners, the coefficient shows a value close to 1. However, it takes a low value close to 0 when protein A has many interacting partners that do not interact with protein B, and vice versa. The deficiency of the original definition is that the coefficient is 0 when protein A interacts only with protein B. We modified the coefficient so that the coefficient between proteins A and B takes the value 1 when no other proteins interact with the pair. The modified Jaccard coefficient is defined as follows:
![]() |
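To make the behavior of the Jaccard coefficient concrete, the following Python sketch implements the original definition together with one hypothetical modification. The modification shown here, which adds each protein to its own partner set before taking the ratio, is only an assumption chosen because it satisfies the property stated above (value 1 when no other proteins interact with the pair); it is not necessarily the formula used by the authors.

```python
def jaccard(m, n):
    """Original Jaccard coefficient |M intersect N| / |M union N|."""
    return len(m & n) / len(m | n)

def modified_jaccard(a, b, partners_a, partners_b):
    """Hypothetical modification: include each protein in its own partner set.

    This is an illustrative assumption, not the authors' confirmed formula.
    """
    m = set(partners_a) | {a}
    n = set(partners_b) | {b}
    return len(m & n) / len(m | n)

# A interacts only with B, and B only with A.
only_pair_original = jaccard({"B"}, {"A"})
only_pair_modified = modified_jaccard("A", "B", {"B"}, {"A"})

# A has two extra partners (C and D) that do not interact with B.
extra_partners_modified = modified_jaccard("A", "B", {"B", "C", "D"}, {"A"})
```

Under this assumed variant, a pair with no other partners scores 1, and the score falls as one protein accumulates partners that the other lacks.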
3.2 Assessment based on the ROC curve
The relationships between the true and false positives for the four correlation coefficients were also examined by drawing ROC curves (Fig. 3). As described above, proteins from 41 sources were used in this study. There is a possibility that the selection of source organisms may affect the accuracy of the prediction. In order to make the evaluation robust to the selection of source organisms, we took the following approach. Out of the 41 sources, 20 organisms were randomly selected. Then, $\rho^{\mathrm{MIRROR}}$, $\rho^{\mathrm{16S}}$, $\rho^{\mathrm{AVE}}$ and $\rho^{\mathrm{PC1}}$ for every pair of 26 proteins were calculated using the randomly selected 20 organisms. The procedure was repeated 1000 times. The rates of true and false positives were calculated in each iteration step with 20 different threshold values. Based on the true and false positive rates averaged at each threshold value, ROC curves for $\rho^{\mathrm{MIRROR}}$, $\rho^{\mathrm{16S}}$, $\rho^{\mathrm{AVE}}$ and $\rho^{\mathrm{PC1}}$ were drawn by connecting the points with 2D coordinates consisting of the two averaged rates. As shown in the figure, the ROC curves for $\rho^{\mathrm{16S}}$, $\rho^{\mathrm{AVE}}$ and $\rho^{\mathrm{PC1}}$ lay above that of $\rho^{\mathrm{MIRROR}}$ when the rates of false positives were small. However, when the rates of false positives increased, the relationship was inverted and the curve of $\rho^{\mathrm{MIRROR}}$ was above those of $\rho^{\mathrm{16S}}$, $\rho^{\mathrm{AVE}}$ and $\rho^{\mathrm{PC1}}$. In actual applications, one would select pairs of proteins with high correlation coefficients as candidates for interacting partners. The result of the analysis with the ROC curve, together with the observation of the decreasing patterns of the correlation coefficients, suggests that our method realizes a high true positive rate and a low false positive rate for pairs of proteins showing high correlation. This would be a benefit of our prediction method, even when considering the deficiency of a higher ratio of false negatives than the original mirror tree method.
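The building blocks of this resampling procedure can be sketched in Python as follows. The sketch shows how a random subset of 20 of the 41 sources restricts a distance matrix and how one ROC point is computed for a given threshold. The threshold grid is a hypothetical choice, since the text states only that 20 thresholds were used, and the function names are illustrative.

```python
import random

def submatrix(matrix, indices):
    """Restrict a distance matrix to the selected source organisms."""
    return [[matrix[i][j] for j in indices] for i in indices]

def roc_point(scores, labels, threshold):
    """Return (false positive rate, true positive rate) at one threshold."""
    tp = sum(1 for s, l in zip(scores, labels) if s > threshold and l)
    fp = sum(1 for s, l in zip(scores, labels) if s > threshold and not l)
    positives = sum(1 for l in labels if l)
    negatives = len(labels) - positives
    return fp / negatives, tp / positives

# One random draw of 20 organisms out of 41 (seed fixed for repeatability).
rng = random.Random(0)
subset = sorted(rng.sample(range(41), 20))

# Hypothetical grid of 20 thresholds between -1 and 1.
thresholds = [-1.0 + 0.1 * k for k in range(20)]

# Toy scores and labels (made-up values) to show one ROC point.
point = roc_point([0.9, 0.8, 0.3, 0.1], [True, False, True, False], 0.5)
```

Averaging the two rates over many such draws at each threshold, and connecting the averaged points, gives curves of the kind described in this subsection.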
3.3 Prediction accuracy and distance between species
Finally, we examined how much the prediction accuracy is influenced by the closeness among the source organisms used in the data. The procedure for the analysis was as follows.
1. Randomly select 20 organisms from the 41 source organisms.
2. Compute the average of the distances over all possible pairs of the 20 organisms, based on the 16S rRNAs.
3. Repeat (1) and (2) 10 000 times and generate the distribution of 10 000 average distances.
4. Classify the sets into three groups based on the distribution: the first group (upper 5% of the distribution), the second group (lower 5% of the distribution) and the third group (the rest).
Note that the first group consisted of the sets of distantly related organisms, whereas the closely related organisms constituted the second group.
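The grouping procedure above can be sketched in Python as follows, assuming the 16S rRNA distances are available as a square matrix. The function names and the toy matrix are hypothetical, and the toy run uses far fewer organisms and repetitions than the study.

```python
import random

def mean_pairwise_distance(dist, subset):
    """Average distance over all pairs of organisms in the subset."""
    pairs = [(i, j) for k, i in enumerate(subset) for j in subset[k + 1:]]
    return sum(dist[i][j] for i, j in pairs) / len(pairs)

def classify_subsets(dist, subset_size, repeats, seed=0, tail=0.05):
    """Return (distant, close, rest) groups of (mean distance, subset) tuples."""
    rng = random.Random(seed)
    samples = []
    for _ in range(repeats):
        subset = sorted(rng.sample(range(len(dist)), subset_size))
        samples.append((mean_pairwise_distance(dist, subset), subset))
    samples.sort(key=lambda sample: sample[0])
    k = int(round(len(samples) * tail))
    close = samples[:k]                       # lower tail: closely related sets
    distant = samples[len(samples) - k:]      # upper tail: distantly related sets
    rest = samples[k:len(samples) - k]
    return distant, close, rest

# Toy distances: six organisms placed on a line (made-up values).
toy = [[abs(i - j) for j in range(6)] for i in range(6)]
distant, close, rest = classify_subsets(toy, subset_size=3, repeats=100)
```

With the settings in the text, the call would use `subset_size=20` and `repeats=10000` on the 41 x 41 matrix of 16S rRNA distances.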
For each group, $\rho^{\mathrm{MIRROR}}$, $\rho^{\mathrm{16S}}$, $\rho^{\mathrm{AVE}}$ and $\rho^{\mathrm{PC1}}$ were calculated, and the corresponding ROC curves were drawn for the three groups from 20 different threshold values (Fig. 4). The rates of the false and true positives calculated at each threshold value were averaged and were then used to draw the ROC curve, as described above. As shown in the figure, the performance of the first group was better than those of the second and third groups, in terms of the false positive rates. This observation suggests that the inclusion of proteins from distantly related sources increases the reliability of the correlation coefficients for the detection of co-evolutionary behavior. The inclusion of distantly related sources would be required to accurately estimate the unit vector $|u\rangle$ used to construct the projection operator.
| 4 CONCLUSION |
|---|
The mirror tree method is an outstanding approach for the prediction of protein-protein interactions. The approach with co-evolutionary information has introduced new perspectives into the computational analyses of protein-protein interactions, which were mainly investigated by comparisons of genomic contexts. In this paper we presented several methods to improve the performance of the original mirror tree method by controlling for the phylogenetic relationships among the sources with the projection operator. In the experiment, we confirmed that our methods could drastically reduce the number of false positives in the prediction. We also showed that the inclusion of proteins from distantly related sources could improve the prediction accuracy.
Our method generated more false negatives than the original mirror tree method. As described above, we speculate that the number of interacting partners could be the reason for the increased number of false negatives. However, if we select protein pairs with a high correlation coefficient, say >0.8, by using our method, then we can predict with high reliability that the proteins of the pair interact, that is, are physically in contact.
| Acknowledgments |
|---|
This work was supported by grants from the Ministry of Education, Culture, Sports, Science and Technology, the Japan Society for the Promotion of Science and the Japan Science and Technology Corporation. The computational resource was provided by the Bioinformatics Center, Institute for Chemical Research, Kyoto University.
Conflict of Interest: none declared.
Received on April 23, 2005; revised on June 23, 2005; accepted on June 28, 2005
| REFERENCES |
|---|
Dandekar, T., et al. (1998) Conservation of gene order: a fingerprint of proteins that physically interact. Trends Biochem. Sci., 23, 324-328.
Enright, A., et al. (1999) Protein interaction maps for complete genomes based on gene fusion events. Nature, 402, 86-90.
Felsenstein, J. (2004) PHYLIP (Phylogeny Inference Package) version 3.6. Distributed by the author. Department of Genome Sciences, University of Washington, Seattle.
Gavin, A., et al. (2002) Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature, 415, 141-147.
Gertz, J., et al. (2003) Inferring protein interactions from phylogenetic distance matrices. Bioinformatics, 19, 2039-2045.
Goh, C., et al. (2000) Co-evolution of proteins with their interaction partners. J. Mol. Biol., 299, 283-293.
Gustafson, A., et al. (2005) ASRP: the Arabidopsis Small RNA Project Database. Nucleic Acids Res., 33, D637-D640.
Ho, Y., et al. (2002) Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature, 415, 180-183.
Ito, T., et al. (2001) A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc. Natl Acad. Sci. USA, 98, 4569-4574.
Jones, D., et al. (1992) The rapid generation of mutation data matrices from protein sequences. Comput. Appl. Biosci., 8, 275-282.
Kanehisa, M., et al. (2004) The KEGG resource for deciphering the genome. Nucleic Acids Res., 32, D277-D280.
Katoh, K., et al. (2005) MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res., 33, 511-518.
Kishino, H. and Hasegawa, M. (1989) Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in hominoidea. J. Mol. Evol., 29, 170-179.
Nooren, I.M. and Thornton, J.M. (2003) Diversity of protein-protein interactions. EMBO J., 22, 3486-3492.
Pazos, F. and Valencia, A. (2001) Similarity of phylogenetic trees as indicator of protein-protein interaction. Protein Eng., 14, 609-614.
Pazos, F. and Valencia, A. (2002) In silico two-hybrid system for the selection of physically interacting protein pairs. Proteins, 47, 219-227.
Pellegrini, M., et al. (1999) Assigning protein functions by comparative genome analysis: protein phylogenetic profiles. Proc. Natl Acad. Sci. USA, 96, 4285-4288.
Ramani, A. and Marcotte, E.M. (2003) Exploiting the co-evolution of interacting proteins to discover interaction specificity. J. Mol. Biol., 327, 273-284.
Salwinski, L., et al. (2004) The Database of Interacting Proteins: 2004 update. Nucleic Acids Res., 32, D449-D451.
Sprinzak, E., et al. (2003) How reliable are experimental protein-protein interaction data? J. Mol. Biol., 327, 919-923.
Tan, S., et al. (2004) ADVICE: Automated Detection and Validation of Interaction by Co-Evolution. Nucleic Acids Res., 32, W69-W72.
Uetz, P., et al. (2000) A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature, 403, 623-627.
von Mering, C., et al. (2002) Comparative assessment of large-scale data sets of protein-protein interactions. Nature, 417, 399-403.