Bioinformatics Advance Access originally published online on March 12, 2008
Bioinformatics 2008 24(9):1129-1136; doi:10.1093/bioinformatics/btn099
A nearest neighbor approach for automated transporter prediction and categorization from protein sequences
Bioinformatics Lab, Plant Biology Division, The Samuel Roberts Noble Foundation, Inc., 2510 Sam Noble Parkway, Ardmore, OK 73401, USA
*To whom correspondence should be addressed.
Abstract
Motivation: Membrane transport proteins play a crucial role in the import and export of ions, small molecules or macromolecules across biological membranes. Currently, only a limited number of published computational tools enable the systematic discovery and categorization of transporters prior to costly experimental validation. To address this problem, we used a nearest neighbor method that integrates homology search and topological analysis into a machine-learning framework.
Results: Our approach satisfactorily distinguished 484 transporter families in the Transporter Classification Database, a curated and representative database for transporters. A five-fold cross-validation on the database achieved an average positive classification rate of 72.3%. Furthermore, this method successfully detected transporters in seven model and four non-model organisms, ranging from archaean to mammalian species. A preliminary literature-based validation confirmed 65.8% of our predictions across the 11 organisms, including the 55.9% of our predictions that overlap with 83.6% of the predicted transporters in TransportDB.
Availability and Supplementary information: http://bioinfo.noble.org/manuscript-support/transporter/
Contact: pzhao{at}noble.org
Associate Editor: Burkhard Rost
Received on November 21, 2007; revised on March 10, 2008; accepted on March 11, 2008
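
As an illustration of the general idea summarized in the abstract (nearest neighbor classification evaluated by five-fold cross-validation), the following minimal Python sketch may help. It is not the authors' method: the similarity measure here is a simple k-mer Jaccard score standing in for the homology search and topological analysis, and the sequences, the family labels `family_A` and `family_B`, and all function names are hypothetical.

```python
def kmers(seq, k=3):
    """Return the set of overlapping k-mers in a protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a, b, k=3):
    """Jaccard similarity of k-mer sets (an assumed stand-in for a homology score)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def nearest_neighbor(query, references):
    """Return (family, score) of the most similar reference sequence.

    references is a list of (sequence, family) pairs.
    """
    best_family, best_score = None, -1.0
    for seq, family in references:
        score = similarity(query, seq)
        if score > best_score:
            best_family, best_score = family, score
    return best_family, best_score


def k_fold_accuracy(data, folds=5):
    """Fraction of sequences assigned to their own family under k-fold cross-validation."""
    correct = 0
    for f in range(folds):
        # Every folds-th item (starting at f) is held out; the rest are references.
        test = data[f::folds]
        train = [d for i, d in enumerate(data) if i % folds != f]
        for seq, family in test:
            predicted, _ = nearest_neighbor(seq, train)
            correct += predicted == family
    return correct / len(data)


# Hypothetical toy data: two made-up families of near-identical sequences.
TOY_DATA = [
    ("MKTLLVAGGSLLAAVVGLL", "family_A"),
    ("MKTLLVAGGSLLAAVIGLL", "family_A"),
    ("MRTLLVAGGSLLAAVVGLL", "family_A"),
    ("MKTLLVAGASLLAAVVGLL", "family_A"),
    ("MKTLLIAGGSLLAAVVGLL", "family_A"),
    ("MDPQRWENHKYTCFSDEQR", "family_B"),
    ("MDPQRWENHKYTCFSDEQK", "family_B"),
    ("MDPQKWENHKYTCFSDEQR", "family_B"),
    ("MDPQRWENHRYTCFSDEQR", "family_B"),
    ("MDPQRWDNHKYTCFSDEQR", "family_B"),
]

if __name__ == "__main__":
    print("five-fold accuracy on toy data:", k_fold_accuracy(TOY_DATA, folds=5))
```

In a realistic setting the toy similarity function would be replaced by an alignment-based score, and a minimum-score cutoff could be used to label queries with no close neighbor as non-transporters; both choices are assumptions here rather than details taken from the paper.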