Bioinformatics Vol. 15 no. 12 1999
Pages 1028-1038
© 1999 Oxford University Press
PHYSEAN: PHYsical SEquence ANalysis for the identification of protein domains on the basis of physical and chemical properties of amino acids
SmithKline Beecham Pharmaceuticals, Bioinformatics Department, King of Prussia, PA 19406-0939, USA and Research Group for Evolutionary Genetics, Hungarian Academy of Sciences and Eötvös University, Nádor u. 7, Budapest, H-1051, Hungary
Motivation: PHYSEAN predicts protein classes with highly variable sequences on the basis of their physical, chemical and biological characteristics, such as diverse hydrophobicity, structural propensity and steric properties. These characteristics, calculated from multiple positions in a sequence, may be conserved even between sequences that fail to produce alignments at any acceptable level of statistical significance. PHYSEAN complements methods that require sequence alignments (BLAST, FASTA, dynamic programming) by adding physicochemical information on the protein or domain that is less residue- and position-specific.
Results: We predict proteins or their domains, such as signal peptides, using physical, chemical, geometric and biological properties of the 20 amino acids. This comprehensive set of properties may cover the diagnostic functional and structural aspects of a domain or a protein class. We automatically select and weight a subset of properties so as to discriminate between, e.g., signal peptides and amino-termini of cytosolic proteins with the lowest number of incorrect predictions. This optimal selection of properties and their weights significantly decreases the number of incorrect predictions compared with any single property or any combination of unweighted properties. Weights have been optimized by high-performance linear programming models that systematically find the optimal solution from among an astronomical number of property/weight combinations. PHYSEAN's performance is demonstrated by highly accurate predictions of signal peptides (the vehicles for protein transport across membranes) and their cleavage sites. The results indicate that reliable predictions are possible even in the absence of sequence conservation, using an automated physical and chemical analysis of proteins.
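The idea of scoring a sequence segment by a weighted combination of amino acid properties can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from PHYSEAN: the two properties (the Kyte-Doolittle hydropathy index and a simple net side-chain charge), the hand-picked weights and threshold, and the two made-up segments. In particular, the sketch does not perform the linear programming optimization that selects and weights the properties; it only shows how a fixed set of weights would be applied.

```python
# Kyte-Doolittle hydropathy index, used here only as an example property.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Simplified side-chain charge near neutral pH; residues not listed count as 0.
CHARGE = {"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0}


def mean_property(segment, scale):
    """Average a per-residue property over all positions of a segment."""
    if not segment:
        raise ValueError("segment must not be empty")
    return sum(scale.get(aa, 0.0) for aa in segment) / len(segment)


def linear_score(segment, weighted_scales):
    """Weighted sum of segment-averaged properties.

    weighted_scales is a list of (weight, scale) pairs.
    """
    return sum(w * mean_property(segment, scale) for w, scale in weighted_scales)


def looks_like_signal_peptide(segment, weighted_scales, threshold):
    """Classify a segment by comparing its linear score with a threshold."""
    return linear_score(segment, weighted_scales) > threshold


# Hypothetical weights and threshold, chosen by hand (not optimized).
WEIGHTED_SCALES = [(1.0, HYDROPATHY), (0.5, CHARGE)]
THRESHOLD = 1.0

# Made-up toy segments, not real protein sequences.
hydrophobic_segment = "MKLLLLLLLLALALA"
charged_segment = "MDEEDKRSEEDDKRS"

if __name__ == "__main__":
    for seg in (hydrophobic_segment, charged_segment):
        print(seg, looks_like_signal_peptide(seg, WEIGHTED_SCALES, THRESHOLD))
```

Because the properties are averaged over the whole segment, two segments with no alignable residues can still receive similar scores, which is the sense in which such features are less residue- and position-specific than alignment-based ones.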
Availability: The source code for the prediction program will be available to collaborators.
Contact: Steve_Ladunga{at}sbphrd.com
Received on December 9, 1998; revised on June 14, 1999; accepted on June 24, 1999