Functional evaluation of domain–domain interactions and human protein interaction networks
Department of Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Stuhlsatzenhausweg 85, 66123 Saarbrücken, Germany
*To whom correspondence should be addressed.
| ABSTRACT |
|---|
Motivation: Large amounts of protein and domain interaction data are being produced by experimental high-throughput techniques and computational approaches. To gain insight into the value of the provided data, we used our new similarity measure based on the Gene Ontology (GO) to evaluate the molecular functions and biological processes of interacting proteins or domains. The applied measure particularly addresses the frequent annotation of proteins or domains with multiple GO terms.
Results: Using our similarity measure, we compare predicted domain–domain and human protein–protein interactions with experimentally derived interactions. The results show that our similarity measure is of significant benefit in quality assessment and confidence ranking of domain and protein networks. We also derive useful confidence score thresholds for dividing domain interaction predictions into subsets of low and high confidence.
Contact: mario.albrecht{at}mpi-inf.mpg.de
Supplementary information: Supplementary data are available at Bioinformatics online.
| 1 INTRODUCTION |
|---|
Experimental high-throughput techniques have produced enormous amounts of protein–protein interaction (PPI) data for different species (Sharan and Ideker, 2006). These data can now be mined for new information on the functions and interrelationships of proteins (Bork et al., 2004). In particular, different bioinformatics methods, mainly based on the homology of protein sequences, have supported the large-scale prediction of human protein networks (Brown and Jurisica, 2005; Huang et al., 2004; Lehner and Fraser, 2004; McDermott et al., 2005; Persico et al., 2005; Rhodes et al., 2005). Additionally, manually curated literature data and four large-scale yeast two-hybrid (Y2H) maps have recently become available for the human interactome (Goehler et al., 2004; Rual et al., 2005; Stelzl et al., 2005; Lim et al., 2006; Mishra et al., 2006). However, in contrast to predicted data, the experimental coverage of the human interactome is still low. To predict protein interaction networks, domain–domain interactions (DDIs) are often taken into account (Wojcik and Schachter, 2001; Deng et al., 2002; Liu et al., 2005; Rhodes et al., 2005). For this purpose, different sets of DDIs have been predicted using bioinformatics methods (Ng et al., 2003; Liu et al., 2005; Riley et al., 2005) and supplement experimental DDI sets derived from 3D structure data (Finn et al., 2005; Stein et al., 2005).
The Gene Ontology (GO) consortium provides a standardized vocabulary that is commonly used to annotate genes and their products with biological processes and molecular functions (Gene Ontology Consortium, 2006). This annotation particularly allows for assessing the functional similarity of genes or their products. Resnik (1995) and Lin (1998) introduced semantic similarity measures for the comparison of single terms in is-a ontologies. Both measures are based on the information content of ontology terms. Building on these semantic similarity measures, several methods for the functional comparison of gene products have been introduced. Lu et al. (2005) and Lin et al. (2004) evaluated the usefulness of different features, ranging from expression profiles to functional relationships between genes, for the prediction of PPIs. They concluded that functional similarity based on GO annotation leads to high accuracy in predicting PPIs. Wu et al. (2006b) also introduced new similarity measures between GO terms and proteins. Their measures were used to create a predicted network of PPIs and to evaluate genome-scale data sets. Very recently, Guo et al. (2006) assessed the applicability of GO-based similarity measures to human regulatory pathways. They showed that the functional similarity between two proteins decreases as their distance within the same regulatory pathway increases.
One problem with existing GO-based similarity measures is that they do not account for the frequent annotation of gene products or protein domains with multiple GO terms or that they simply average over all annotations. To address this problem, we use our novel GO similarity measure that explicitly deals with this functional multiplicity (Schlicker et al., 2006). The measure is applied to rank interaction networks and the corresponding prediction methods based on the overall functional similarity of the interacting proteins or domains. The comparison of experimentally derived sets with predicted sets of DDIs using our GO similarity measure results in confidence score (CS) thresholds separating low- and high-confidence subsets of predicted DDIs. In addition, we utilize our measure to analyze experimental and predicted networks of human protein interactions.
| 2 MATERIALS AND METHODS |
|---|
2.1 Experimental and predicted data sets
Two experimental sets of DDIs were taken from iPfam (Finn et al., 2005) and the database of 3D interacting domains (3did) (Stein et al., 2005) and compared to three sets of predicted interactions between Pfam-A domains (Finn et al., 2006). The first predicted set is InterDom, a database of putatively interacting domains compiled from different data sources (Ng et al., 2003). The other two sets are taken from two recent publications by Liu, Liu, and Zhao (LLZ) (Liu et al., 2005) and by Riley et al. (domain pair exclusion analysis, DPEA) (Riley et al., 2005). Their bioinformatics approaches are methodological extensions of an expectation-maximization algorithm first applied to the prediction of domain interactions by Deng et al. (2002). The DDI prediction methods assign a CS to each DDI and rank the predicted DDIs according to the score. InterDom uses different data sources to infer DDIs and calculates the CS based on the support from each source (Ng et al., 2003). LLZ and DPEA compute maximum-likelihood estimates to derive a CS, and we use the probability and the log-odds score E as CS from LLZ and DPEA, respectively (Liu et al., 2005; Riley et al., 2005). For the analysis, we used all interactions between Pfam-A domains contained in the data sets, including self-interactions and intra-chain interactions from iPfam and 3did. The pfam2go file from the GO web site (http://www.geneontology.org/external2go/pfam2go) contains a mapping of Pfam-A domains to GO terms. This mapping is automatically derived from a manually curated mapping of InterPro entries to GO terms (Camon et al., 2004). An InterPro entry is mapped to a GO term if the term matches the function or the process of the InterPro entry and all proteins of this family are annotated with this GO term (Mulder et al., 2003). The pfam2go mapping (downloaded on 7 July 2005) was used to annotate Pfam-A domains with all available GO terms, irrespective of evidence code. Table 1 summarizes the number of domains and DDIs in each data set.
In addition to the GO annotation obtained by the pfam2go mapping, we used the generic GO-slim set (http://www.geneontology.org/GO.slims.shtml) to map the annotation of domains to more generic functions and processes (Camon et al., 2003). This allows for the determination of GO categories that occur more frequently than others in the data set. In particular, we were interested in determining an enrichment of domains annotated with protein binding in the experimental data sets compared to the predicted ones.
Furthermore, we analyzed six predicted sets of human PPIs named Bioverse (McDermott et al., 2005), HiMAP (Rhodes et al., 2005), HomoMINT (Persico et al., 2005), Sanger (Lehner and Fraser, 2004), OPHID (Brown and Jurisica, 2005) and POINT (Huang et al., 2004). Additionally, subsets of core interactions with high confidence were derived from Bioverse, HiMAP and Sanger. The Bioverse-core set contains very reliable interactions based on a sequence similarity threshold of at least 80% between the human protein and its homolog in the source species (Yu et al., 2004), HiMAP-core interactions have a large likelihood ratio (Rhodes et al., 2005), and Sanger-core comprises only predictions with the greatest experimental support (Lehner and Fraser, 2004). Finally, we assembled five consensus sets named ConSetn that consist of protein interactions contained in at least n predicted data sets, with n ranging from 2 to 6.
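The construction of the consensus sets can be illustrated with a minimal Python sketch. It assumes that each predicted data set is available as a list of protein pairs and that interactions are undirected; the function name `consensus_sets` and the toy identifiers are hypothetical and not taken from the original analysis.

```python
from collections import Counter


def consensus_sets(predicted_sets, n_values):
    """Return, for each n, the interactions found in at least n predicted sets.

    predicted_sets: dict mapping a data set name to an iterable of (a, b) pairs.
    Interactions are treated as undirected, so (a, b) equals (b, a).
    """
    counts = Counter()
    for interactions in predicted_sets.values():
        # Deduplicate within one data set so each set votes at most once per edge.
        unique_edges = {frozenset(pair) for pair in interactions}
        counts.update(unique_edges)
    return {
        n: {edge for edge, support in counts.items() if support >= n}
        for n in n_values
    }


# Toy example with invented protein identifiers (not real data).
toy_sets = {
    "setA": [("P1", "P2"), ("P2", "P3")],
    "setB": [("P2", "P1"), ("P3", "P4")],
    "setC": [("P1", "P2"), ("P3", "P2")],
}
toy_consensus = consensus_sets(toy_sets, range(2, 4))
for n, edges in sorted(toy_consensus.items()):
    print(n, sorted(sorted(edge) for edge in edges))
```

Because every interaction in ConSetn with a larger n is also supported by at least n − 1 data sets, the consensus sets are nested, which explains why the sets for large n are small.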
As experimental data sets, we downloaded the manually curated human protein reference database (HPRD) (Mishra et al., 2006), release of September 13, 2005, and two Y2H maps that we named Vidal (Rual et al., 2005) and Wanker (Stelzl et al., 2005) after the senior authors. We also merged the two Y2H maps into the combined data set Vidal & Wanker. Both Y2H maps became available after the six predicted human networks had been published. Further experimental PPIs were extracted from the published networks of direct and indirect interaction partners for ataxins (ATX) (Lim et al., 2006) and huntingtin (HTT) (Goehler et al., 2004). These networks include Y2H and literature-derived data sets, which we call ATX-/HTT-Y2H and ATX-/HTT-literature, respectively. The set ATX-interologs comprises interactions from the ATX network that have been derived by mapping interologs (Lim et al., 2006), and thus we regard it as another predicted set of PPIs. In order to be able to compare different PPI data sets, the diverse gene and protein accession numbers of the PPI sets were mapped to NCBI Entrez gene identifiers (Maglott et al., 2005). The mapping of Entrez gene identifiers to GO annotations was obtained from NCBI (ftp://ftp.ncbi.nih.gov/gene/DATA/gene2go.gz).
Furthermore, we compiled another set of PPIs using the interacting proteins that underlie iPfam DDIs. This set was annotated from two different sources, that is, with the GO annotation of proteins in the UniProt release 5.4 (IUP-set) and with the GO annotation of the protein domains in the pfam2go file (IPG-set) using Pfam release 17.0 (Finn et al., 2006; Wu et al., 2006a). In case of the IPG-set, only the annotations of the interacting domains were taken into account. We excluded self-interactions from both the IUP-set and the IPG-set, whereas self-interactions were not removed from the other PPI data sets.
2.2 Functional similarity measure
The GO controlled vocabulary consists of three different ontologies: biological process (BP), molecular function (MF), and cellular component. The ontologies are organized as directed acyclic graphs with terms being represented as nodes and parent–child relationships as edges. There are two types of edges: is-a links, indicating that the child is an instance of its parent, and part-of links, used if the child is a component of its parent. Each node may have several parents and children.
Our semantic similarity measure is an extension of previous measures by Resnik and Lin (Resnik, 1995; Lin, 1998). As suggested by Resnik, we defined the probability of a term as its relative frequency of occurrence in a set of annotated gene products. The root node of each ontology has the probability 1. We used the GO annotation of all proteins in the UniProt release 5.4 for the calculation of term frequencies. The semantic similarity of two terms is defined as follows:
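$$\mathrm{sim}(c_1, c_2) = \max_{c \in S(c_1, c_2)} \left( \frac{2 \log p(c)}{\log p(c_1) + \log p(c_2)} \cdot \bigl(1 - p(c)\bigr) \right)$$
where S(c1, c2) is the set of common ancestors of the terms c1 and c2, and p(c) is the probability of term c (Schlicker et al., 2006).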
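As an illustration of an information-content-based term similarity, the following Python sketch computes a Lin-style similarity that is additionally weighted by 1 − p(c) and maximized over all common ancestors. This weighting follows our reading of Schlicker et al. (2006) and should be regarded as an assumption here; the toy ontology, the term probabilities and all function names are invented for the example.

```python
import math

# Hypothetical toy ontology: term -> set of direct parents.
PARENTS = {
    "root": set(),
    "binding": {"root"},
    "catalysis": {"root"},
    "protein binding": {"binding"},
    "kinase binding": {"protein binding"},
    "kinase activity": {"catalysis"},
}

# Hypothetical term probabilities (relative annotation frequencies).
# The root has probability 1, as stated in the text.
PROB = {
    "root": 1.0,
    "binding": 0.5,
    "catalysis": 0.4,
    "protein binding": 0.2,
    "kinase binding": 0.01,
    "kinase activity": 0.05,
}


def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def term_similarity(c1, c2):
    """Lin-style similarity weighted by (1 - p(c)), maximized over common ancestors."""
    denominator = math.log(PROB[c1]) + math.log(PROB[c2])
    if denominator == 0.0:
        # Both terms are the root and carry no information.
        return 0.0
    best = 0.0
    for c in ancestors(c1) & ancestors(c2):
        lin = 2.0 * math.log(PROB[c]) / denominator
        best = max(best, lin * (1.0 - PROB[c]))
    return best


print(term_similarity("kinase binding", "kinase binding"))
print(term_similarity("binding", "binding"))
print(term_similarity("kinase binding", "kinase activity"))
```

In this toy setting, a specific term compared with itself scores close to 1, whereas the generic term binding compared with itself scores only 0.5, and two terms that share only the root score 0.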
This semantic similarity measure for single GO terms can be expanded to a functional similarity measure of gene products. Let g1 and g2 be two gene products annotated with the GO term sets GO1 and GO2 of size N and M, respectively. The similarity matrix S containing all pair-wise similarity values is computed as follows:
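$$S = (s_{ij}), \qquad s_{ij} = \mathrm{sim}(\mathrm{GO1}_i, \mathrm{GO2}_j), \qquad i \in \{1, \dots, N\},\ j \in \{1, \dots, M\}$$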
The row vectors and column vectors of matrix S represent the two possible directions of comparing g1 and g2. While the similarity computed from g1 to g2 (rowScore) is defined as the average over the row maxima, the similarity from g2 to g1 (columnScore) is defined as the average over the column maxima:
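$$\mathrm{rowScore} = \frac{1}{N} \sum_{i=1}^{N} \max_{1 \le j \le M} s_{ij}, \qquad \mathrm{columnScore} = \frac{1}{M} \sum_{j=1}^{M} \max_{1 \le i \le N} s_{ij}$$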
The rowScore and the columnScore are always between 0 and 1. Furthermore, we define the functional similarity of two gene products with respect to one ontology as
$$\mathrm{GOscore} = \max(\mathrm{rowScore}, \mathrm{columnScore})$$
We refer to this GOscore as MFscore or BPscore in case of MF or BP, respectively. One important aspect of this score is that it allows for comparing gene products with multiple functions. This property is especially important when comparing GO annotations of domains because they occur in diverse proteins involved in different processes. For more details on our GO similarity measure, see Schlicker et al. (2006). Since this GO similarity measure requires that either both interacting proteins or both interacting domains are annotated with GO terms, the functional similarity analysis considers only interactions with GO annotations present for both interacting partners (see columns 4 and 5 in Table 1). Therefore, the sets of interactions analyzed with the BPscore and with the MFscore differ. A self-interaction does not necessarily receive a high GOscore because the definition of this similarity measure takes into account how generic the GO annotation term is. For instance, a self-DDI in which the interacting domain is annotated only with the generic term binding will receive a low MFscore.
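The averaging over row and column maxima described above can be sketched in a few lines of Python. The sketch assumes that the pairwise term similarities are already available as an N × M matrix and that the final score is the maximum of the two directional scores, as we understand the definition in Schlicker et al. (2006); the function name `go_score` and the example values are hypothetical.

```python
def go_score(sim):
    """Compute rowScore, columnScore and their maximum from a similarity matrix.

    sim: non-empty list of N rows, each a list of M pairwise term similarities
    between the annotations of two gene products.
    """
    n_rows = len(sim)
    n_cols = len(sim[0])
    # Average of the best match for each term of the first gene product.
    row_score = sum(max(row) for row in sim) / n_rows
    # Average of the best match for each term of the second gene product.
    column_score = sum(
        max(sim[i][j] for i in range(n_rows)) for j in range(n_cols)
    ) / n_cols
    # Assumption: the combined score is the larger of the two directions.
    return row_score, column_score, max(row_score, column_score)


# Invented similarity values for a product with 3 terms versus one with 2 terms.
example_matrix = [
    [0.9, 0.1],
    [0.2, 0.1],
    [0.0, 0.3],
]
example_row, example_column, example_score = go_score(example_matrix)
print(example_row, example_column, example_score)
```

In the example, the first gene product carries two terms without a good counterpart, which lowers the row average, whereas both terms of the second product find a reasonable match. Taking the larger directional score therefore does not penalize a multifunctional gene product for its additional annotations.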
| 3 RESULTS AND DISCUSSION |
|---|
3.1 Comparing confidence scores for domain interactions
The predictions of DDIs by InterDom, LLZ and DPEA are compiled from diverse data sources using different bioinformatics methods. To gain insight into the similarity and the quality of the predictions, we compared the predicted sets of DDIs with each other and with the experimentally derived sets iPfam and 3did. The overlap of the sets InterDom, LLZ and DPEA regarding Pfam-A domains as well as regarding their predicted interactions is given in Table 2. Both LLZ and DPEA share many Pfam-A domains and predicted DDIs with InterDom, while the overlap between LLZ and DPEA is much smaller.
Figure 1 and Table S1 give an overview of the overlap of the experimental interactions contained in iPfam and 3did and the three sets of predicted interactions InterDom, LLZ and DPEA. Of the DDIs predicted by DPEA, 11.9% are confirmed by iPfam or 3did, whereas only 7.4% and 3.0% of the DDIs predicted by InterDom and LLZ, respectively, are in common with iPfam or 3did. Thus, DPEA appears to be the best of the three prediction methods.
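The fraction of predicted interactions that is confirmed by an experimental set can be computed as in the following Python sketch. It assumes undirected interactions given as pairs of domain identifiers; the function name `confirmed_fraction` and the placeholder identifiers are hypothetical and do not reproduce the data of this study.

```python
def confirmed_fraction(predicted, experimental):
    """Return the fraction of predicted interactions found in the experimental set.

    Both arguments are iterables of (a, b) pairs. Pairs are treated as
    undirected, and a self-interaction (a, a) becomes a one-element set.
    """
    predicted_edges = {frozenset(pair) for pair in predicted}
    experimental_edges = {frozenset(pair) for pair in experimental}
    if not predicted_edges:
        return 0.0
    return len(predicted_edges & experimental_edges) / len(predicted_edges)


# Placeholder domain identifiers, invented for illustration only.
toy_predicted = [("D1", "D2"), ("D2", "D3"), ("D4", "D4"), ("D5", "D6")]
toy_experimental = [("D2", "D1"), ("D4", "D4"), ("D7", "D8")]
toy_fraction = confirmed_fraction(toy_predicted, toy_experimental)
print(f"{100 * toy_fraction:.1f}% of the toy predictions are confirmed")
```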
Other criteria for prediction quality are the CS and the rank assigned to experimentally observed domain interactions. DDIs contained in iPfam and 3did are assigned top ranks by all three prediction methods (Fig. S1). Surprisingly, further analyses indicate only weak correlations between ranks of different prediction methods (Figs. S2–S4). However, DDIs from iPfam that are predicted by two different computational methods are assigned a good rank by at least one method. This suggests that all methods are able to detect correct domain interactions. Further details on the results are described in the online supplement.
3.2 Background distribution and randomized domain networks
In order to obtain a background distribution of the functional similarities, all Pfam-A domains with available GO annotations were used to compute MFscore and BPscore distributions for all possible domain pairs (Fig. S5). Most of the domain pairs have dissimilar molecular functions, resulting in low MFscores. The overall MFscore mean is about 0.1 and the median is 0. The BPscore is distributed similarly, although, in comparison with the MFscore, more domain pairs have a BPscore between 0.1 and 0.2 and fewer domain pairs have a BPscore below 0.1. This finding is also reflected by the higher BPscore mean and median of 0.23 and 0.17, respectively. These results indicate that the BPscore should generally be higher than the MFscore.
In our analysis, we also randomized all DDI networks to detect possible bias towards specific functions or processes. This was accomplished by keeping one of the two nodes of the interaction edges fixed while randomly shuffling the other nodes of the edges. The obtained distributions are all very similar and closely resemble the background distribution for BP and MF (Figs. S6 and S7). In contrast to the predicted data sets, the distributions of the randomized experimental iPfam and 3did networks contain more DDIs with a BPscore below 0.1, but fewer with a BPscore between 0.1 and 0.2. Figures S8 and S9 depict the results of the analysis using GO-slim. The distributions of various GO categories, including protein binding, show only minor differences between the data sets. These data suggest that none of the domain interaction networks is biased towards specific processes or functions.
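The randomization procedure can be sketched in Python as follows. The sketch assumes that a network is given as a list of node pairs and keeps the first node of every edge fixed while permuting the second nodes; the function name `randomize_network`, the optional seed and the toy edges are hypothetical.

```python
import random


def randomize_network(edges, seed=None):
    """Keep the first node of each edge fixed and shuffle the second nodes.

    edges: list of (a, b) pairs. The number of edges and the number of
    occurrences of every node on each side are preserved.
    """
    rng = random.Random(seed)
    fixed_nodes = [a for a, _ in edges]
    shuffled_nodes = [b for _, b in edges]
    rng.shuffle(shuffled_nodes)
    return list(zip(fixed_nodes, shuffled_nodes))


# Toy network with invented node names.
toy_edges = [("A", "B"), ("C", "D"), ("E", "F"), ("G", "H")]
toy_randomized = randomize_network(toy_edges, seed=0)
print(toy_randomized)
```

Note that this simple scheme may create self-interactions or duplicate edges by chance. Whether such edges were filtered in the original analysis is not stated, so the sketch leaves them in place.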
3.3 Computing and analyzing GOscore distributions
The BPscore distributions for iPfam and 3did (Fig. 2) show that most experimental DDIs have a very high similarity score exceeding 0.8, which means that the corresponding interacting domains are part of the same process or closely related processes. This is supported by high means of about 0.9 and medians of almost 1. The distributions for the predicted sets InterDom and DPEA look alike. Interestingly, only one-third of the predicted interactions have a BPscore above 0.8. Furthermore, both data sets include a large fraction of interactions with a BPscore below 0.4, indicating almost no functional similarity between the domains. The mean is 0.51 for both data sets, and the medians are 0.39 and 0.41 for InterDom and DPEA, respectively. The LLZ predictions contain substantially fewer interactions with a high BPscore and many more interactions with a very low BPscore. This is reflected by the relatively low mean of 0.35 and the median of 0.2.
More than 50% of all interactions in iPfam and 3did are self-interactions. The predicted sets InterDom and LLZ contain no self-interactions between Pfam-A domains with GO annotation, whereas DPEA contains self-interactions. Therefore, we calculated the BPscore distributions without using self-interacting domains (Fig. S10). The resultant distributions are very similar to the distributions obtained using all available domain interactions. The BPscore mean and median values are 0.001–0.130 lower. In particular, the medians for iPfam and 3did decrease by only 0.003 and 0.001, respectively. In summary, InterDom performs slightly better than DPEA, and both show better performance than LLZ.
Figure S11 contains the MFscore distributions of all data sets. Interestingly, the distributions for iPfam and 3did are quite distinct from the other distributions. Almost 80% of the domain interactions in iPfam or 3did have an MFscore >0.8, which indicates that related molecular functions are annotated to the interacting domains. In both data sets, domain interactions with very low MFscore are rare. The means of >0.8 and the medians of almost 1 corroborate this interpretation. The predictions made by InterDom and DPEA show similar distributions, but rather low means and medians. As in case of the BPscore distribution described earlier for LLZ, predictions made by LLZ show a lower MFscore. Again, the distributions obtained after excluding self-interactions are very similar to the other distributions (Fig. S12). The MFscore mean and median are 0.02–0.18 lower. InterDom has better performance than DPEA, and both perform better than LLZ.
3.4 Deriving confidence score thresholds
All prediction methods InterDom, LLZ and DPEA provide CSs for the predicted DDIs. However, in order to utilize sets of predicted interactions in practice, it is important to derive reasonable thresholds for low- and high-confidence sets of DDIs. It can be expected that the functional similarity of domains predicted to interact increases as the confidence of these predictions rises. To verify this expectation, we used different CS thresholds to calculate the GOscore means and medians of all interactions with a CS larger than the respective threshold. We also calculated the overlap of these interactions with iPfam and 3did.
Figure 3 shows the change in BPscore mean and median, and the change in data set size with varying CS threshold for the DPEA set. In case of DPEA, when raising the CS threshold from 3 to 6, the BPscore median increases from slightly >0.4 to almost 1 and the mean rises from 0.51 to 0.7. The MFscore median and the overlap with iPfam and 3did show a steep increase in this CS range (Figs. S13 and S14). Consequently, we suggest assigning predictions with a CS between 3 and 6, and >6 to DPEA subsets of low- and high-confidence DDIs, respectively.
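The threshold analysis can be outlined with a short Python sketch. It assumes that every predicted interaction is available as a pair of confidence score and GOscore and reports, for each threshold, the number of interactions with a CS larger than the threshold together with the mean and median of their scores. The function name `threshold_profile` and the example values are invented and do not reproduce the DPEA results.

```python
from statistics import mean, median


def threshold_profile(interactions, thresholds):
    """Summarize GOscores of all interactions whose CS exceeds each threshold.

    interactions: list of (confidence_score, go_score) pairs.
    Returns a list of (threshold, size, mean_score, median_score) tuples;
    mean and median are None if no interaction passes the threshold.
    """
    profile = []
    for threshold in thresholds:
        scores = [score for cs, score in interactions if cs > threshold]
        if scores:
            profile.append((threshold, len(scores), mean(scores), median(scores)))
        else:
            profile.append((threshold, 0, None, None))
    return profile


# Invented (CS, BPscore) pairs for illustration only.
toy_interactions = [(2.0, 0.1), (3.5, 0.4), (5.0, 0.6), (6.5, 1.0), (7.0, 0.9)]
toy_profile = threshold_profile(toy_interactions, [3, 6, 10])
for row in toy_profile:
    print(row)
```

Plotting the mean, the median and the data set size against the threshold then shows the CS range in which functional similarity rises steeply while the set remains reasonably large.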
The analysis of the InterDom set reveals that the BPscore median reaches 0.98 with a CS threshold of 30 (Fig. S15). The BPscore mean is 0.68 at this point and increases with higher thresholds. The same score development holds true for MFscore, but it is shifted slightly towards higher thresholds (Fig. S16). At a threshold of 60, the data set consists of 1888 interactions and the median increase diminishes. The overlap with iPfam and 3did increases with rising InterDom score and is about 27% for a threshold of 60 (Fig. S17). Altogether, these results suggest a threshold of 60 for InterDom predictions with high confidence.
The analysis of LLZ predictions reveals that the BPscore mean and median, and the overlap with iPfam and 3did, are very low over the whole CS range (Figs. S18–S20). These results do not allow for deriving a reasonable CS threshold for any LLZ subset of DDIs.
3.5 Comparing human protein interaction networks
We calculated the BPscore for all data sets of PPIs. Table 3 summarizes the results ranked by the average BPscore. The BPscore means range from 0.82 for Bioverse-core to 0.37 for the Wanker PPI set. While the average BPscores for the predicted data sets vary significantly, the experimental Y2H data sets have rather low mean BPscore. In contrast, predicted data sets such as both HiMAP sets and Bioverse-core as well as the manually curated sets HPRD and HTT-literature receive high mean scores. The different results for the HTT and ATX networks also indicate that literature-curated, carefully validated, PPIs reach a higher BPscore than PPIs derived by high-throughput experiments.
The BPscore means of the iPfam-derived IUP- and IPG-sets with the same PPIs, but distinct GO annotations, are 0.76 and 0.81, respectively. These values are lower than the mean of the corresponding DDIs in iPfam, which, in part, may be due to the fact that we excluded self-interactions in the two PPI sets. The IUP-set is also annotated with all available GO protein annotations and not only with GO annotations of the interacting domains as in case of the IPG-set. The score distributions for the IUP- and IPG-sets show that using the GO annotation of proteins or Pfam-A domains leads to different results (Fig. S21). One reason may be that Pfam-A domain annotations alone do not describe the complete protein functions. Therefore, if the same domain occurs in both interacting proteins and is responsible for the interaction, the BPscore calculated for the IPG-set will be higher than that for the IUP-set. In comparison, the manually curated HPRD set has a mean BPscore of 0.66. The distribution of this data set shows that over 50% of the interactions have a BPscore above 0.7 (Fig. S21). However, 10% of the interactions have a score between 0.1 and 0.2. The consensus PPI sets ConSet2-4 show a similar mean BPscore, and ConSet5 and ConSet6 score higher, but they constitute only small interaction data sets.
Especially on the lower ranks, the BPscore ranking of the data sets is similar to rankings resulting from the computed HPRD or Y2H verification rate (Table 3), that is, the percentage of interactions contained in HPRD or the combined Y2H set Vidal & Wanker. The predicted Bioverse-core set and the consensus sets have the best verification rates with respect to HPRD. The fact that the Vidal and Wanker sets have published validation rates of 78% and 62–66%, respectively, agrees well with the slightly higher mean BPscore 0.47 of Vidal in contrast to the mean 0.36 of Wanker (Rual et al., 2005; Stelzl et al., 2005).
| 4 CONCLUSIONS |
|---|
Following the idea that interacting domains or proteins should have highly similar BP annotation and, to a smaller degree, similar MF annotation, we evaluated the functional similarity of three predicted and two experimental DDI networks as well as several predicted and experimental human PPI networks. Furthermore, we investigated to which extent predicted DDIs or PPIs overlap with experimentally derived interactions.
We demonstrated that the application of functional similarity measures is not restricted to the validation of PPIs (Guo et al., 2006), but also useful for DDIs. Our analysis of DDIs revealed that the BP similarity of interacting domains is generally higher than the corresponding MF similarity. This observed difference between BP and MF similarity agrees well with findings by Guo et al. for PPIs using other GO similarity measures. The difference may be partly due to the fact that interacting domains or proteins may perform different functions though they act in similar processes. Another reason may be that GO terms are more densely connected in the top levels of the BP ontology than of the MF ontology.
The iPfam-derived IUP- and IPG-sets encompass the same PPIs, but the IUP-set is annotated with the GO terms of the proteins in UniProt and the IPG-set with the GO terms of the interacting Pfam-A domains. The comparison of these two sets revealed that the BPscore results depend on the annotation used. This indicates that the choice of the annotation source contributes to the differing findings for DDIs and PPIs. Moreover, a higher number of proteins annotated with diverse BPs may decrease the mean BPscore of protein networks in contrast to sets of DDIs annotated with more generic GO terms.
In agreement with our results on human protein interaction networks, Reguly et al. (2006) observed for yeast interaction data sets that the GO annotation of literature-curated PPI sets is more coherent than the GO annotation of high-throughput PPI sets. Since manually curated data sets of PPIs taken from scientific literature have a higher mean BPscore than most predicted and high-throughput data sets, the latter sets may contain a significant number of false interactions or a large number of proteins involved in novel processes. This can lead to a considerable decrease in BPscore. Furthermore, proteins described in the literature may be annotated particularly well using GO. In contrast, most other, less characterized, proteins are annotated by automated and thus less reliable methods (Camon et al., 2003). Therefore, a more thorough analysis of the PPI results using alternative measures will be required to explain differences between predicted and experimental data sets.
Our functional similarity analysis in conjunction with an evaluation of the overlap between experimentally derived and predicted DDIs allowed the definition of CS thresholds for DDI prediction results. These thresholds are useful for improving PPI predictions based on DDIs as well as for assessing the confidence of PPIs derived by high-throughput experiments. In the future, incorporating other similarity criteria besides GO may further improve the confidence assessment of predicted interactions. As the coverage and quality of GO annotations improve, the importance of approaches that use functional similarity for the validation and prediction of PPIs and DDIs will increase.
| ACKNOWLEDGEMENTS |
|---|
We are grateful to Francisco S. Domingues and the anonymous reviewers for useful comments on the manuscript. Part of this study was financially supported by the German National Genome Research Network (NGFN) and by the German Research Foundation (DFG), contract number KFO 129/1-1. This work was conducted in the context of the BioSapiens Network of Excellence funded by the European Commission under grant number LSHG-CT-2003-503265.
Conflict of Interest: none declared.
| FOOTNOTES |
|---|
Associate Editor: Dmitrij Frishman
Received on October 31, 2006; revised on December 23, 2006; accepted on January 14, 2007
| REFERENCES |
|---|
Bork P, et al. (2004) Protein interaction networks from yeast to human. Curr. Opin. Struct. Biol., 14, 292–299.
Brown KR, Jurisica I. (2005) Online predicted human interaction database. Bioinformatics, 21, 2076–2082.
Camon E, et al. (2003) The Gene Ontology Annotation (GOA) project: implementation of GO in SWISS-PROT, TrEMBL, and InterPro. Genome Res., 13, 662–672.
Camon E, et al. (2004) The Gene Ontology Annotation (GOA) Database: sharing knowledge in Uniprot with Gene Ontology. Nucleic Acids Res., 32, D262–D266.
Deng M, et al. (2002) Inferring domain-domain interactions from protein-protein interactions. Genome Res., 12, 1540–1548.
Finn RD, et al. (2005) iPfam: visualization of protein-protein interactions in PDB at domain and amino acid resolutions. Bioinformatics, 21, 410–412.
Finn RD, et al. (2006) Pfam: clans, web tools and services. Nucleic Acids Res., 34, D247–D251.
Gene Ontology Consortium. (2006) The Gene Ontology (GO) project in 2006. Nucleic Acids Res., 34, D322–D326.
Goehler H, et al. (2004) A protein interaction network links GIT1, an enhancer of huntingtin aggregation, to Huntington's disease. Mol. Cell, 15, 853–865.
Guo X, et al. (2006) Assessing semantic similarity measures for the characterization of human regulatory pathways. Bioinformatics, 22, 967–973.
Huang TW, et al. (2004) POINT: a database for the prediction of protein-protein interactions based on the orthologous interactome. Bioinformatics, 20, 3273–3276.
Lehner B, Fraser AG. (2004) A first-draft human protein-interaction map. Genome Biol., 5, R63.
Lim J, et al. (2006) A protein-protein interaction network for human inherited ataxias and disorders of Purkinje cell degeneration. Cell, 125, 801–814.
Lin D. (1998) An information-theoretic definition of similarity. Proceedings of the Fifteenth International Conference on Machine Learning, Madison, Wisconsin, USA, 296–304.
Lin N, et al. (2004) Information assessment on predicting protein-protein interactions. BMC Bioinformatics, 5, 154.
Liu Y, et al. (2005) Inferring protein-protein interactions through high-throughput interaction data from diverse organisms. Bioinformatics, 21, 3279–3285.
Lu LJ, et al. (2005) Assessing the limits of genomic data integration for predicting protein networks. Genome Res., 15, 945–953.
Maglott D, et al. (2005) Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res., 33, D54–D58.
McDermott J, et al. (2005) Functional annotation from predicted protein interaction networks. Bioinformatics, 21, 3217–3226.
Mishra GR, et al. (2006) Human protein reference database - 2006 update. Nucleic Acids Res., 34, D411–D414.
Mulder NJ, et al. (2003) The InterPro Database, 2003 brings increased coverage and new features. Nucleic Acids Res., 31, 315–318.
Ng SK, et al. (2003) InterDom: a database of putative interacting protein domains for validating predicted protein interactions and complexes. Nucleic Acids Res., 31, 251–254.
Persico M, et al. (2005) HomoMINT: an inferred human network based on orthology mapping of protein interactions discovered in model organisms. BMC Bioinformatics, 6 (Suppl 4), S21.
Reguly T, et al. (2006) Comprehensive curation and analysis of global interaction networks in Saccharomyces cerevisiae. J. Biol., 5, 11.
Resnik P. (1995) Using information content to evaluate semantic similarity in a taxonomy. Proceedings of the 14th International Joint Conference on Artificial Intelligence, Montreal, Quebec, Canada, 448–453.
Rhodes DR, et al. (2005) Probabilistic model of the human protein-protein interaction network. Nat. Biotechnol., 23, 951–959.
Riley R, et al. (2005) Inferring protein domain interactions from databases of interacting proteins. Genome Biol., 6, R89.
Rual JF, et al. (2005) Towards a proteome-scale map of the human protein-protein interaction network. Nature, 437, 1173–1178.
Schlicker A, et al. (2006) A new measure for functional similarity of gene products based on Gene Ontology. BMC Bioinformatics, 7, 302.
Sharan R, Ideker T. (2006) Modeling cellular machinery through biological network comparison. Nat. Biotechnol., 24, 427–433.
Stein A, et al. (2005) 3did: interacting protein domains of known three-dimensional structure. Nucleic Acids Res., 33, D413–D417.
Stelzl U, et al. (2005) A human protein-protein interaction network: a resource for annotating the proteome. Cell, 122, 957–968.
Wojcik J, Schachter V. (2001) Protein-protein interaction map inference using interacting domain profile pairs. Bioinformatics, 17 (Suppl 1), S296–S305.
Wu CH, et al. (2006a) The Universal Protein Resource (UniProt): an expanding universe of protein information. Nucleic Acids Res., 34, D187–D191.
Wu X, et al. (2006b) Prediction of yeast protein-protein interaction network: insights from the Gene Ontology and annotations. Nucleic Acids Res., 34, 2137–2150.
Yu H, et al. (2004) Annotation transfer between genomes: protein-protein interologs and protein-DNA regulogs. Genome Res., 14, 1107–1118.