Bioinformatics Advance Access originally published online on June 17, 2009
Bioinformatics 2009 25(17):2174-2180; doi:10.1093/bioinformatics/btp366
Automated protein (re)sequencing with MS/MS and a homologous database yields almost full coverage and accuracy
1 David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, 2 Department of Computer Science, University of Western Ontario, London and 3 Bioinformatics Solutions, Inc., Waterloo, Canada
* To whom correspondence should be addressed.
Abstract
Motivation: Bottom-up tandem mass spectrometry (MS/MS) is now routinely used in proteomics to identify proteins from a sequence database. De novo sequencing software is also available for sequencing novel peptides of relatively short length. However, automated sequencing of novel proteins from MS/MS data remains a challenging problem.
Results: Very often, although the target protein is novel, a homologous protein is present in a known database. For this case, we propose a novel algorithm and automated software tool, named Champs, for sequencing the complete protein from MS/MS data of a few enzymatic digestions of the purified protein. Validation with two standard proteins showed that our automated method yields >99% sequence coverage and 100% sequence accuracy on both. Our method is useful for sequencing novel proteins or for re-sequencing a protein that has mutations compared with the database protein sequence.
Availability: The software, named Champs (Complete Homology-Assisted Ms/ms Protein Sequencing), and the MS/MS data used in the article, are freely available at http://monod.uwaterloo.ca/champs/.
Contact: binma{at}uwaterloo.ca
Supplementary information: Supplementary data are available at Bioinformatics online.
Associate Editor: Alex Bateman
Received on March 22, 2009; revised on May 17, 2009; accepted on June 9, 2009