Bioinformatics Advance Access originally published online on November 7, 2008
Bioinformatics 2009 25(1):14-21; doi:10.1093/bioinformatics/btn569
Discovery of phosphorylation motif mixtures in phosphoproteomics data
1Department of Computer Science, Brown University, 2Toyota Technological Institute at Chicago, Chicago, IL, 3Department of Chemistry and Molecular Biology, Cell Biology, and Biochemistry and 4Center for Computational Molecular Biology, Brown University, Providence, RI, USA
*To whom correspondence should be addressed.
Abstract
Motivation: Modification of proteins via phosphorylation is a primary mechanism for signal transduction in cells. Phosphorylation sites on proteins are determined in part through particular patterns, or motifs, present in the amino acid sequence.
Results: We describe an algorithm that simultaneously discovers multiple motifs in a set of peptides that were phosphorylated by several different kinases. Such sets of peptides are routinely produced in proteomics experiments. Our motif-finding algorithm uses the principle of minimum description length to determine a mixture of sequence motifs that distinguish a foreground set of phosphopeptides from a background set of unphosphorylated peptides. We show that our algorithm outperforms existing motif-finding algorithms on synthetic datasets consisting of mixtures of known phosphorylation sites. We also derive a motif specificity score that quantifies whether or not the phosphoproteins containing an instance of a motif have a significant number of known interactions. Application of our motif-finding algorithm to recently published human and mouse proteomic studies recovers several known phosphorylation motifs and reveals a number of novel motifs that are enriched for interactions with a particular kinase or phosphatase. Our tools provide a new approach for uncovering the sequence specificities of uncharacterized kinases or phosphatases.
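To make the minimum description length idea concrete, the following is a minimal illustrative sketch and not the algorithm described in the paper. It assumes equal-length peptides aligned on the phosphorylation site, motifs that fix a single residue at each of a few positions, and a simple two-part code in which a motif pays a fixed model cost and saves the background-model bits of its fixed residues for every foreground peptide it matches. The function names, the pseudocount, the model-cost formula and the greedy selection loop are all assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def background_probs(background, pseudocount=1.0):
    """Per-position residue probabilities estimated from background peptides."""
    length = len(background[0])
    total = len(background) + pseudocount * len(AMINO_ACIDS)
    probs = []
    for pos in range(length):
        counts = Counter(pep[pos] for pep in background)
        probs.append({aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS})
    return probs


def matches(peptide, motif):
    """A motif is a dict {position: residue}; all fixed positions must agree."""
    return all(peptide[pos] == aa for pos, aa in motif.items())


def bits_saved(motif, foreground, probs):
    """Bits saved by describing matching peptides with the motif (assumed code)."""
    # Assumed model cost: name one (position, residue) pair per fixed position.
    model_cost = len(motif) * math.log2(len(probs) * len(AMINO_ACIDS))
    # Background-model bits that the motif makes unnecessary for one peptide.
    per_peptide = sum(-math.log2(probs[pos][aa]) for pos, aa in motif.items())
    n_match = sum(matches(pep, motif) for pep in foreground)
    return n_match * per_peptide - model_cost


def greedy_motif_mixture(foreground, background, candidates):
    """Greedily add motifs while each one still shortens the description."""
    if not candidates:
        return []
    probs = background_probs(background)
    remaining = list(foreground)
    chosen = []
    while remaining:
        best = max(candidates, key=lambda m: bits_saved(m, remaining, probs))
        if bits_saved(best, remaining, probs) <= 0:
            break
        chosen.append(best)
        # Peptides explained by the chosen motif are removed before the next pass.
        remaining = [pep for pep in remaining if not matches(pep, best)]
    return chosen
```

In this sketch a motif is selected only if the bits it saves on the still-unexplained foreground peptides exceed the cost of stating the motif, which is one simple way a mixture of motifs, rather than a single motif, can emerge from the data.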
Availability: Software is available at http://cs.brown.edu/people/braphael/software.html.
Contact: aritz{at}cs.brown.edu; braphael{at}cs.brown.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
Associate Editor: Burkhard Rost
Received on August 1, 2008; revised on October 24, 2008; accepted on October 28, 2008