Bioinformatics Vol. 16 no. 9 2000
Pages 767-775
© 2000 Oxford University Press
Original Paper
Identification of novel multi-transmembrane proteins from genomic databases using quasi-periodic structural properties
1 Department of Ecology and Evolutionary Biology, 2 Department of Statistics, 3 Department of Molecular, Cellular and Developmental Biology, Yale University, USA
Received on August 25, 1999; revised on April 14, 2000; accepted on May 10, 2000
Motivation: Identification of novel G protein-coupled receptors and other multi-transmembrane proteins from genomic databases using structural features.
Results: Here we describe a new algorithm, which we call the quasi-periodic feature classifier (QFC), for identifying multi-transmembrane proteins from genomic databases, with a specific application to identifying G protein-coupled receptors (GPCRs). The QFC algorithm uses a concise set of statistical variables as the feature space to characterize the quasi-periodic physico-chemical properties of multi-transmembrane proteins. To identify GPCRs, these variables are then used in a non-parametric linear discriminant function that separates GPCRs from non-GPCRs. The running time of the algorithm is linear in the number of sequences, and on a test dataset it positively identified 96% of known GPCRs. The QFC algorithm also works well with short random segments of proteins: it positively identified GPCRs at a level greater than 90% even with segments as short as 100 amino acids. The primary advantage of the algorithm is that it does not directly use primary sequence patterns, which may be subject to sampling bias. The utility of the new algorithm has been demonstrated by the isolation, from the Drosophila genome project database, of a novel class of seven-transmembrane proteins that were shown to be the elusive olfactory receptor genes of Drosophila.
Availability: C++/Perl available from http://jkim.eeb.yale.edu/index.html
Contact: Junhyong Kim, Dept. of Ecology and Evolutionary Biology, Yale University, P.O. Box 208106, New Haven, CT 06520-8106; junhyong.kim{at}yale.edu
Supplementary information: Test dataset and training dataset are available from http://jkim.eeb.yale.edu/index.html
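The abstract does not list the statistical variables or the discriminant coefficients used by QFC, so the following Python sketch is only an illustration of the general idea, not the published method. It assumes the Kyte-Doolittle hydropathy scale as the physico-chemical property, a sliding-window average as the profile, and the strongest autocorrelation lag as a stand-in for "quasi-periodicity"; the function names, window size, lag range, and feature names are all hypothetical choices made for this example.

```python
from math import sqrt

# Kyte-Doolittle hydropathy values (assumed here; the abstract names no scale).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}


def hydropathy_profile(seq, window=15):
    """Sliding-window mean hydropathy; unknown residues count as 0."""
    values = [KD.get(aa, 0.0) for aa in seq.upper()]
    if len(values) < window:
        return []
    total = sum(values[:window])
    profile = [total / window]
    for i in range(window, len(values)):
        total += values[i] - values[i - window]
        profile.append(total / window)
    return profile


def autocorrelation(profile, lag):
    """Normalized (biased) autocorrelation of the profile at one lag."""
    n = len(profile)
    mean = sum(profile) / n
    centered = [x - mean for x in profile]
    denom = sum(c * c for c in centered)
    if denom == 0 or lag >= n:
        return 0.0
    num = sum(centered[i] * centered[i + lag] for i in range(n - lag))
    return num / denom


def periodicity_features(seq, window=15, min_lag=20, max_lag=60):
    """Summarize a sequence by a few illustrative (hypothetical) variables."""
    profile = hydropathy_profile(seq, window)
    if len(profile) <= max_lag:
        raise ValueError("sequence too short for the chosen lag range")
    lags = range(min_lag, max_lag + 1)
    best_lag = max(lags, key=lambda k: autocorrelation(profile, k))
    n = len(profile)
    mean = sum(profile) / n
    std = sqrt(sum((x - mean) ** 2 for x in profile) / n)
    return {"mean": mean, "std": std, "period": best_lag,
            "strength": autocorrelation(profile, best_lag)}


def linear_score(features, weights, bias=0.0):
    """Linear discriminant score; the weights must be fitted on training data."""
    return bias + sum(weights[name] * features[name] for name in weights)


# Toy input: seven hydrophobic stretches separated by hydrophilic loops.
toy = ("L" * 20 + "K" * 20) * 7
toy_features = periodicity_features(toy)
print(toy_features)
```

In a real classifier the weights passed to `linear_score` would be estimated from labeled GPCR and non-GPCR training sequences; none are supplied here because the abstract does not report them. Since each sequence is summarized independently by a fixed-size feature vector, the total cost of a sketch like this grows linearly with the number of sequences, in line with the running time stated above.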
* To whom correspondence should be addressed.





