Bioinformatics Advance Access originally published online on November 25, 2004
Bioinformatics 2005 21(8):1693-1694; doi:10.1093/bioinformatics/bti161

© The Author 2004. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oupjournals.org

Procom: a web-based tool to compare multiple eukaryotic proteomes

Jin Billy Li 1,*,†, Miao Zhang 1,2,†, Susan K. Dutcher 1 and Gary D. Stormo 1

1Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
2Department of Biomedical Engineering, Washington University, St. Louis, MO 63130, USA

*To whom correspondence should be addressed.


    Abstract

Summary: Each organism has traits that are shared with some, but not all, organisms. Identification of genes needed for a particular trait can be accomplished by a comparative genomics approach using three or more organisms. Genes that occur in organisms without the trait are removed from the set of genes in common among organisms with the trait. To facilitate these comparisons, a web-based server, Procom, was developed to identify the subset of genes that may be needed for a trait.

Availability: The Procom program is freely available with documentation and examples at http://ural.wustl.edu/~billy/Procom/

Contact: billy@ural.wustl.edu

Comparative genomics has proven extremely powerful in several aspects of genomic sciences that include gene prediction and regulatory element identification (Ureta-Vidal et al., 2003). Most comparative genomics studies focus on finding features in common among diverse organisms. Comparisons of closely related organisms often reveal too many candidates to narrow down the gene list of interest. Identification of genes that are retained in some organisms, but are lost in others, also provides important additional information. This comparison is useful when the genes encode proteins associated with a trait that is specific to only a subset of the organisms. Recently, a collection of genes enriched for ciliary and basal body proteins was obtained by comparing proteomes of ciliated organisms, which include Caenorhabditis elegans, Chlamydomonas reinhardtii, Ciona intestinalis, Drosophila melanogaster, Homo sapiens and Mus musculus, and non-ciliated organisms, which include Arabidopsis thaliana and Saccharomyces cerevisiae (Avidor-Reiss et al., 2004; Li et al., 2004). This set of proteins greatly facilitated positional cloning of a new human disease gene associated with ciliary or basal body defects by narrowing down 230 candidate genes to only 2 (Li et al., 2004). This accomplishment prompted the development of Procom (Proteome comparison), a generalized web-based tool to compare eukaryotic proteomes; this tool will facilitate other comparisons by this method.

All predicted proteins from the genomes of 30 completely, or nearly completely, sequenced eukaryotic organisms are used for comparison (see Fig. 1 for list). More organisms will be added as the sequences are completed. The proteomes were pair-wise compared with each other using WU-BLASTP (W. Gish, http://blast.wustl.edu) with the threshold E-value = 1 to produce 870 BLASTP output files. To accelerate the subsequent proteome–proteome comparisons, each of these files was parsed to retrieve the query and subject names, and the E-values.
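The figure of 870 output files follows from comparing each of the 30 proteomes against each of the other 29 (30 × 29 = 870 ordered query/subject pairs). The short Python sketch below illustrates that count and one possible way to reduce a parsed file to the best hit per query protein. It is not the authors' Perl code: the three-column, tab-separated layout (query name, subject name, E-value) and the function names `ordered_pairs` and `parse_hits` are assumptions made for illustration only.

```python
from itertools import permutations


def ordered_pairs(organisms):
    """Return every (query, subject) proteome pair, excluding self-comparisons."""
    return list(permutations(organisms, 2))


def parse_hits(lines):
    """Keep the best (lowest E-value) hit for each query protein.

    Assumes a hypothetical pre-parsed summary with three tab-separated
    columns per line: query name, subject name, E-value.
    Returns {query: (subject, evalue)}.
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        query, subject, evalue = line.split("\t")
        e = float(evalue)
        if query not in best or e < best[query][1]:
            best[query] = (subject, e)
    return best


# 30 proteomes give 30 * 29 = 870 ordered comparisons.
n_files = len(ordered_pairs(range(30)))

# Made-up protein names, for illustration only.
demo = parse_hits(["q1\ts1\t1e-5", "q1\ts2\t1e-20", "q2\ts3\t0.5"])
```

Storing only the best E-value per query is one plausible reduction; keeping every hit would also work but is not needed to decide whether a protein has a match below a given cutoff.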



Fig. 1 Snapshot of the Procom interface. The output is the ANCHOR proteins that have matches in all of the chosen INTERSECTION organisms, but do not have matches in any of the chosen SUBTRACTION organisms.

 
The user specifies three classes of organisms for the comparison: the anchor, intersection and subtraction organisms (Fig. 1). The anchor organism serves as the query. The intersection organisms (0 or more) are used to identify the set of shared proteins, and the subtraction organisms (0 or more) are used to exclude anchor proteins that also have matches in any of those organisms. The user specifies the BLASTP E-value cutoffs, which can differ between the intersection and subtraction classes. A low intersection E-value combined with a high subtraction E-value generates a stringent list of output proteins, because a protein must match strongly in every intersection organism and is discarded on even a weak match in a subtraction organism. Conversely, a high intersection E-value combined with a low subtraction E-value imposes loose criteria.

For each of the selected intersection and subtraction organisms, the previously parsed pair-wise BLASTP file with the anchor as query is retrieved to obtain the protein IDs with an E-value less than that specified by the user. The collections of protein IDs are then compared: only IDs present in the collections of all intersection organisms are kept, and any ID present in the collection of a subtraction organism is removed. The output of Procom is the protein IDs of the anchor organism. The user can request both the protein IDs and the corresponding sequences. A link to relevant databases is provided for each protein.
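The selection step described above amounts to set intersection followed by set subtraction. The following Python sketch shows one way this could look; it is an illustration under stated assumptions, not the Procom source (which is written in Perl). The function name `procom_filter`, the `hits` dictionary layout and the protein and organism labels are all hypothetical.

```python
def procom_filter(anchor_proteins, hits, intersection, subtraction,
                  e_intersection, e_subtraction):
    """Return anchor protein IDs matched in all intersection organisms
    and in none of the subtraction organisms.

    hits: {organism: {anchor_protein_id: best E-value against that organism}}
    (a hypothetical in-memory form of the parsed pair-wise BLASTP files).
    """
    def matched(organism, cutoff):
        # Anchor proteins whose best hit in `organism` is below the cutoff.
        return {pid for pid, e in hits.get(organism, {}).items() if e < cutoff}

    result = set(anchor_proteins)          # with no intersection organisms, start from all
    for organism in intersection:
        result &= matched(organism, e_intersection)   # keep the overlap
    for organism in subtraction:
        result -= matched(organism, e_subtraction)    # remove the overlap
    return sorted(result)


# Made-up data, for illustration only.
anchor = ["p1", "p2", "p3", "p4"]
hits = {
    "ciliated": {"p1": 1e-30, "p2": 1e-12, "p3": 1e-3},
    "yeast": {"p2": 1e-8, "p4": 1e-40},
}

# Low intersection cutoff, high subtraction cutoff: stringent.
stringent = procom_filter(anchor, hits, ["ciliated"], ["yeast"], 1e-10, 1e-5)
# → ['p1']

# High intersection cutoff, low subtraction cutoff: loose.
loose = procom_filter(anchor, hits, ["ciliated"], ["yeast"], 1e-2, 1e-10)
# → ['p1', 'p2', 'p3']
```

In this toy example the stringent settings return a single protein and the loose settings return three, which mirrors the effect of the two E-value cutoffs on the size of the output list.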

The Procom program is written in Perl using the CGI module. It is user friendly and takes no more than 1 min for any combination of comparisons.

Procom should allow users to identify a set of proteins that may be associated with a trait of interest. The proteins associated with the trait must be conserved among organisms retaining the trait, but must be missing in organisms lacking the trait. Procom is by no means a comprehensive tool for comparative genomics analysis, but it provides a novel strategy to compartmentalize candidate genes by the disparity among the proteomes of similar and dissimilar organisms. For example, one could identify candidate photosynthesis genes by searching for Arabidopsis proteins that have homologs in rice and the green alga Chlamydomonas, but not in animals. Fungal-specific genes could be enriched in a set of proteins that are shared by S. cerevisiae, Schizosaccharomyces pombe, Aspergillus nidulans, Cryptococcus neoformans, Encephalitozoon cuniculi and Neurospora crassa, but are missing in the remaining organisms.


    Acknowledgments
 
We thank the various genome sequencing consortia for access to the sequences and Dr Warren Gish for providing some resources used in this work. This work was supported by National Institutes of Health grants HG-00249 (to GDS) and GM-32843 (to SKD). JBL was supported in part by the Monsanto Fellowship at Washington University.


    Footnotes
 
†The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.

Received on October 7, 2004; revised on November 9, 2004; accepted on November 17, 2004

    REFERENCES

    Avidor-Reiss, T., Maer, A.M., Koundakjian, E., Polyanovsky, A., Keil, T., Subramaniam, S., Zuker, C.S. (2004) Decoding cilia function: defining specialized genes required for compartmentalized cilia biogenesis. Cell, 117, 527–539.

    Li, J.B., Gerdes, J.M., Haycraft, C.J., Fan, Y., Teslovich, T.M., May-Simera, H., Li, H., Blacque, O.E., Li, L., Leitch, C.C., et al. (2004) Comparative genomics identifies a flagellar and basal body proteome that includes the BBS5 human disease gene. Cell, 117, 541–552.

    Ureta-Vidal, A., Ettwiller, L., Birney, E. (2003) Comparative genomics: genome-wide analysis in metazoan eukaryotes. Nat. Rev. Genet., 4, 251–262.





