Bioinformatics Advance Access originally published online on October 22, 2007
Bioinformatics 2007 23(23):3113-3118; doi:10.1093/bioinformatics/btm506
On the hierarchical classification of G protein-coupled receptors
1Edward Jenner Institute, Compton, Newbury, Berkshire, RG20 7NN, 2Department of Computing and Centre for BioMedical Informatics, University of Kent, Canterbury, Kent CT2 7NF and 3Departments of Computer Science and Electronics, University of York, Heslington, York YO10 5DD, UK
*To whom correspondence should be addressed.
| Abstract |
|---|
Motivation: G protein-coupled receptors (GPCRs) play an important role in many physiological systems by transducing an extracellular signal into an intracellular response. Over 50% of all marketed drugs are targeted towards a GPCR. There is considerable interest in developing an algorithm that could effectively predict the function of a GPCR from its primary sequence. Such an algorithm is useful not only in identifying novel GPCR sequences but in characterizing the interrelationships between known GPCRs.
Results: An alignment-free approach to GPCR classification has been developed using techniques drawn from data mining and proteochemometrics. A dataset of over 8000 sequences was constructed to train the algorithm. This represents one of the largest GPCR datasets currently available. A predictive algorithm was developed based upon the simplest reasonable numerical representation of the protein's physicochemical properties. A selective top-down approach was developed, which used a hierarchical classifier to assign sequences to subdivisions within the GPCR hierarchy. The predictive performance of the algorithm was assessed against several standard data mining classifiers and further validated against Support Vector Machine-based GPCR prediction servers. The selective top-down approach achieves significantly higher accuracy than standard data mining methods in almost all cases.
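As an illustration only, the selective top-down idea described in the Results might be sketched as follows: at each node of the class hierarchy, several candidate classifiers are trained, the one scoring best on held-out data is kept for that node, and a new sequence is classified by descending from the root. The abstract does not specify the classifiers or descriptors used, so everything here is an assumption: the two candidate classifiers (nearest centroid and 1-nearest-neighbour), the two-dimensional feature vectors, and the labels `A`, `A1`, `A2`, `B`, `B1` are hypothetical stand-ins, not the authors' method or data.

```python
import math
from collections import defaultdict


def nearest_centroid(train):
    """Fit a predictor that returns the label of the closest class mean."""
    sums, counts = {}, defaultdict(int)
    for x, label in train:
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[label] += 1
    centroids = {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}
    return lambda x: min(centroids, key=lambda lab: math.dist(x, centroids[lab]))


def one_nearest_neighbour(train):
    """Fit a predictor that returns the label of the closest training example."""
    return lambda x: min(train, key=lambda pair: math.dist(x, pair[0]))[1]


# Hypothetical candidate pool; any set of classifiers could be substituted.
CANDIDATES = {"centroid": nearest_centroid, "1nn": one_nearest_neighbour}


def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)


def train_node(train, valid, depth=0):
    """Build one hierarchy node; examples are (features, path-of-labels) pairs."""
    node_train = [(x, p[depth]) for x, p in train if len(p) > depth]
    if not node_train:
        return None  # no deeper level: this is a leaf
    node_valid = [(x, p[depth]) for x, p in valid if len(p) > depth]

    # Selective step: keep whichever candidate scores best on held-out data.
    best_name, best_predict, best_score = None, None, -1.0
    for name, fit in CANDIDATES.items():
        predict = fit(node_train)
        score = accuracy(predict, node_valid) if node_valid else 0.0
        if score > best_score:
            best_name, best_predict, best_score = name, predict, score

    # Top-down step: each child is trained only on examples under its label.
    children = {}
    for label in {y for _, y in node_train}:
        sub_train = [(x, p) for x, p in train if len(p) > depth and p[depth] == label]
        sub_valid = [(x, p) for x, p in valid if len(p) > depth and p[depth] == label]
        children[label] = train_node(sub_train, sub_valid, depth + 1)
    return {"classifier": best_name, "predict": best_predict, "children": children}


def classify(node, x):
    """Descend from the root, applying the classifier chosen at each node."""
    path = []
    while node is not None:
        label = node["predict"](x)
        path.append(label)
        node = node["children"][label]
    return tuple(path)


# Toy data: hypothetical 2-D descriptors with a two-level label path.
train = [
    ((0.0, 0.0), ("A", "A1")), ((0.2, 0.1), ("A", "A1")),
    ((0.0, 1.0), ("A", "A2")), ((0.1, 1.2), ("A", "A2")),
    ((5.0, 5.0), ("B", "B1")), ((5.2, 4.9), ("B", "B1")),
]
valid = [
    ((0.1, 0.1), ("A", "A1")),
    ((0.1, 0.9), ("A", "A2")),
    ((4.9, 5.1), ("B", "B1")),
]

model = train_node(train, valid)
print(classify(model, (0.05, 1.1)))
print(classify(model, (5.0, 5.1)))
```

Because each node picks its own classifier, a method that separates the top-level classes well need not be the one used to separate subfamilies further down. An error made near the root still propagates to every level below it, which is the usual trade-off of top-down schemes.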
Contact: m.davies{at}mail.cryst.bbk.ac.uk
Associate Editor: John Quackenbush
Received on July 23, 2007; revised on September 10, 2007; accepted on October 3, 2007