Bioinformatics Advance Access originally published online on November 5, 2004
Bioinformatics 2005 21(7):961-968; doi:10.1093/bioinformatics/bti126
A simple statistical method for discriminating outer membrane proteins with better accuracy
Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST) Aomi Frontier Building 17F, 2-43 Aomi, Koto-ku, Tokyo 135-0064, Japan
*To whom correspondence should be addressed.
| Abstract |
|---|
Motivation: Discriminating outer membrane proteins from other folding types of globular and membrane proteins is an important task both for identifying outer membrane proteins from genomic sequences and for the successful prediction of their secondary and tertiary structures.
Results: We have systematically analyzed the amino acid composition of globular proteins from different structural classes and outer membrane proteins. We found that the residues Glu, His, Ile, Cys, Gln, Asn and Ser show a significant difference between globular and outer membrane proteins. Based on this information, we have devised a statistical method for discriminating outer membrane proteins from other globular and membrane proteins. Our approach correctly picked up the outer membrane proteins with an accuracy of 89% for the training set of 377 proteins. In addition, our method correctly excluded the globular proteins with an accuracy of 79% in a non-redundant dataset of 674 proteins. Furthermore, the present method is able to correctly exclude α-helical membrane proteins with an accuracy of up to 80%. These accuracy levels are comparable to other methods in the literature, and this is a simple method, which could be used for identifying outer membrane proteins in genomic sequences. The influence of protein size, structural class and specific residues on discrimination is discussed.
Availability: A program for the discrimination method is available upon request from the corresponding author. The datasets used in this work are available at http://www.cbrc.jp/~gromiha/omp/dataset.html
Contact: michael-gromiha{at}aist.go.jp
| INTRODUCTION |
|---|
During the last two decades we have witnessed exciting advances in the field of membrane proteins. The three-dimensional structures of membrane proteins revealed the existence of two structural motifs, α-helices and β-barrels, in these proteins. In recent years, the crystal structures of several α-helical membrane proteins have been solved at high resolution and the features influencing their structure, folding and stability have been studied in detail (White and Wimley, 1999; Gromiha, 1999). Unlike α-helical membrane proteins, the structure and function of β-barrel membrane proteins are not well understood. β-barrel membrane proteins are found in the outer membranes of bacteria, mitochondria and chloroplasts (Schulz, 2002). These proteins differ from the all-β structural class of globular proteins due to the presence of a lipid environment and they have different structural motifs compared with α-helical membrane proteins. The discrimination of β-barrel membrane proteins [outer membrane proteins (OMPs)] is an important task and it can be utilized in two ways: (i) the successful prediction of membrane spanning β-strand segments and modeling of OMPs, and (ii) identifying OMPs in genomic sequences. A comparative analysis of the distribution of amino acid residues in α-helical and β-barrel membrane proteins shows that the membrane part of OMPs is more complex than that of transmembrane helical proteins due to the presence of many charged and polar residues in the membrane. Consequently, the success rate of discriminating transmembrane helical proteins from other proteins is significantly higher than that of β-barrel membrane proteins (Hirokawa et al., 1998).
Several methods have been proposed for predicting the structural classes of globular proteins and discriminating inner membrane proteins with high accuracy. These methods include discriminant analysis of amino acid residues (Klein, 1986), amino acid composition (Chou and Zhang, 1992), physical-chemical properties (Gromiha and Ponnuswamy, 1995; Bu et al., 1999), the component coupled method (Chou and Maggiora, 1998), residue distribution along the sequence (Kumarevel et al., 2000), the Bayes decision rule (Wang and Yuan, 2000) and the amphiphilicity index of amino acid residues (Mitaku et al., 2002). On the other hand, only a few methods have been reported that identify β-barrel membrane proteins and transmembrane β-barrels in proteomes (Wimley, 2002; Martelli et al., 2002; Liu et al., 2003; Bigelow et al., 2004). Gnanasekaran et al. (2000) developed a structure-based sequence alignment method for identifying β-stranded OMPs. Wimley (2002) analyzed the architecture of 15 OMPs and proposed a method based on hydrophobicity for identifying β-barrel membrane proteins in genomic sequences. Martelli et al. (2002) used 12 OMPs and developed a neural network method for picking up the β-barrel membrane proteins. Liu et al. (2003) analyzed the amino acid composition in the membrane spanning regions of 12 β-barrel membrane proteins and applied the information for discrimination. Bagos et al. (2004a,b) developed an algorithm based on the Hidden Markov Model (HMM) for discriminating OMPs. Natt et al. (2004) used a set of 16 OMPs and proposed a machine learning technique for discrimination.
All these methods use minimal information for the analysis and the prediction accuracy is rather modest.
In our earlier works, we proposed different methods for predicting the membrane spanning β-strand segments in OMPs (Gromiha and Ponnuswamy, 1993; Gromiha et al., 1997, 2004). These methods and other algorithms for locating the transmembrane β-strands are applicable only to OMPs and hence a simple and reliable method is necessary to discriminate the OMPs from other proteins. In this work, we have used a large dataset and computed the amino acid composition for the 20 amino acid residues in both globular proteins and OMPs. We have systematically analyzed the differences and similarities in these groups of proteins and devised a statistical method based on the deviation of amino acid composition for discriminating OMPs. We have tested our approach with several sets of globular proteins belonging to four different structural classes, transmembrane helical proteins, and OMPs obtained from both well-annotated sequences and known three-dimensional structures. Our predicted results showed an accuracy of 84% for correctly picking up the OMPs from known annotated sequences. Our method is able to exclude up to 80% of globular proteins and α-helical membrane proteins. These accuracy levels are comparable to or better than other methods in the literature.
| SYSTEMS AND METHODS |
|---|
Datasets
We used several sets of data for calculating the amino acid composition, identifying OMPs and excluding globular and α-helical transmembrane proteins. Firstly, we used a dataset of 377 annotated OMPs and 674 globular proteins belonging to all structural classes for computing the amino acid composition for the 20 amino acid residues. The well-annotated OMPs were obtained from the PSORT-B database (Gardy et al., 2003), which contains several homologous sequences. The 674 globular protein chains were extracted from the PDB40D_1.37 database of SCOP with a sequence identity of less than 30% (Murzin et al., 1995; Berman et al., 2000). This dataset was used by Wang and Yuan (2000) for predicting the structural classes of globular proteins and it is very challenging for discriminative purposes; methods claiming 100% accuracy for structural class prediction reached an accuracy of only 60% with this dataset (Wang and Yuan, 2000). The globular protein dataset contains 155 all-α proteins, 156 all-β proteins, 184 α + β proteins and 179 α/β proteins. These two sets of 674 globular proteins and 377 OMPs were also used for validating the present method. Further, we tested our method with three other datasets: (i) a subset of 27 non-redundant OMP sequences with less than 35% sequence identity; (ii) a non-redundant dataset of 19 OMPs available in the Protein Data Bank with a sequence identity of less than 25% (Berman et al., 2000); and (iii) a dataset of 268 well-annotated α-helical transmembrane proteins (Gardy et al., 2003). The amino acid sequences of all these five sets of data are available on the web at http://www.cbrc.jp/~gromiha/omp/dataset.html
Back-check and validity check methods
We used the dataset of 377 OMP sequences and 674 globular protein sequences for deriving the amino acid composition. These same proteins were used to predict whether each protein is of globular or outer membrane type. This method is called back-check prediction (or self-consistency test).
For the validity check prediction, we followed the procedures that are widely used in the literature for protein secondary structure and solvent accessibility predictions: a set of N proteins is split into equally balanced subsets; parameters are developed on M proteins and then tested on the remaining N − M proteins (Cuff and Barton, 1999; Ahmad and Gromiha, 2002); the procedure is repeated for all subsets of data to obtain the average accuracy. We used 189 outer membrane and 337 globular proteins (Set A) to derive the amino acid composition and the result obtained with Set A was used to discriminate the remaining proteins in the dataset (188 outer membrane and 337 globular proteins, Set B). The same procedure was repeated by keeping Set B as the training set and Set A as the test set. Further, we shuffled the sequences in the whole dataset of globular proteins and OMPs, separately, and divided the proteins into training (Set A1) and test sets (Set B1) as described above. Using Set A1 and Set B1, we repeated the calculation. This procedure was repeated several times to validate the performance of our method. This type of test is known as validity-check prediction. In addition, we tested each OMP using the amino acid composition computed on a set that does not contain any homologous sequences. Further, we performed a jack-knife test using a dataset of 27 OMP sequences, which have less than 35% sequence identity with each other. We computed the amino acid composition using 26 OMPs and used this information for assigning the type of the left-out protein. In these procedures, the training set contains no information about the tested proteins and hence the prediction accuracy obtained with this method is reliable.
We have also examined the reliability of our method with two other datasets, from which no information was used to derive the amino acid composition. These datasets include (i) 19 non-redundant OMPs and (ii) 268 α-helical transmembrane proteins.
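The two-fold validity check and the jack-knife test described above can be sketched as follows. This is a minimal illustration in Python and not the authors' program; the function names `two_fold_splits` and `jackknife_splits`, the fixed random seed and the placeholder identifiers are assumptions introduced here.

```python
import random

def two_fold_splits(items, repeats=1, seed=0):
    """Yield (train, test) pairs: shuffle, halve, then swap the two halves."""
    rng = random.Random(seed)          # fixed seed is an assumption, for repeatability
    items = list(items)
    for _ in range(repeats):
        rng.shuffle(items)
        half = (len(items) + 1) // 2   # e.g. 377 OMPs -> 189 and 188
        set_a, set_b = items[:half], items[half:]
        yield set_a, set_b             # derive composition on Set A, test on Set B
        yield set_b, set_a             # then swap the roles

def jackknife_splits(items):
    """Yield (train, left_out) pairs, leaving one item out at a time."""
    items = list(items)
    for k, left_out in enumerate(items):
        yield items[:k] + items[k + 1:], left_out

omp_ids = [f"omp{i}" for i in range(377)]      # placeholder identifiers
folds = list(two_fold_splits(omp_ids))
print([(len(a), len(b)) for a, b in folds])    # sizes of the train/test sets
loo = list(jackknife_splits(omp_ids[:27]))
print(len(loo), len(loo[0][0]))                # 27 rounds, 26 training items each
```

Passing `repeats` greater than 1 reshuffles the list before each split, which corresponds to the repeated Set A1/Set B1 calculations; the globular proteins would be split separately in the same way.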
Computation of amino acid composition
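The amino acid composition for the set of OMPs was computed using the number of amino acids of each type and the total number of residues. It is defined as:
$$\mathrm{Comp}(i) = \frac{\sum n_i}{N} \qquad (1)$$
where i denotes one of the 20 amino acid residues, n_i is the number of residues of type i and N is the total number of residues; the summation runs over all proteins in the set.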
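The composition calculation described above can be sketched in Python. The function name `composition` and the toy sequences are hypothetical, and two choices are assumptions rather than statements from the text: letters outside the 20 standard residues are ignored, and the result is expressed in percent (the total deviations reported later are in the tens, which suggests percentages).

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes

def composition(sequences):
    """Percentage of each residue type over all residues in `sequences`."""
    counts = Counter()
    for seq in sequences:
        # Assumption: non-standard letters (X, B, Z, ...) are skipped.
        counts.update(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard residues found")
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}

comp = composition(["MKKLA", "GSSNQ"])   # toy sequences, not real proteins
print(comp["K"], comp["S"], comp["W"])   # 20.0 20.0 0.0
```

Pooling the counts over the whole set before dividing weights each protein by its length; averaging per-protein compositions would be a different design choice.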
Discrimination of OMPs
We have calculated the amino acid composition for both globular proteins (Compglob) and OMPs (CompOMP). For a new protein, X, firstly, we calculated the amino acid composition using Equation (1). Then we calculated the total absolute difference of amino acid composition between protein X and the amino acid composition of globular proteins, and that between protein X and OMPs. The protein X is predicted to be an OMP (globular protein) if the deviation is lowest with CompOMP (Compglob).
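The decision rule above compares two total absolute deviations, one from the globular composition and one from the OMP composition, and assigns the protein to the closer class. The sketch below is a hedged illustration: the names `total_deviation` and `classify`, the toy training strings and the tie rule (ties go to "globular") are assumptions, and the block carries its own compact composition helper so that it runs on its own.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(sequences):
    """Percent composition of the 20 residues over all given sequences."""
    counts = Counter(c for s in sequences for c in s.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}

def total_deviation(comp_x, comp_ref):
    """Sum of absolute composition differences over the 20 residues."""
    return sum(abs(comp_x[aa] - comp_ref[aa]) for aa in AMINO_ACIDS)

def classify(sequence, comp_glob, comp_omp):
    """Return 'OMP' if the sequence deviates less from comp_omp, else 'globular'."""
    comp_x = composition([sequence])
    d_glob = total_deviation(comp_x, comp_glob)
    d_omp = total_deviation(comp_x, comp_omp)
    return "OMP" if d_omp < d_glob else "globular"   # assumption: ties -> globular

# Toy training sets: invented strings, not real proteins and not Table 1 values.
comp_glob = composition(["MEEHICKLEEIHCA", "AEELIKHCEEIL"])
comp_omp = composition(["GSNQTYSNQGTYRA", "SNQGSTNQYGSA"])
print(classify("SSNNQQGTYA", comp_glob, comp_omp))   # OMP
print(classify("EEHIICKLEA", comp_glob, comp_omp))   # globular
```

In practice `comp_glob` and `comp_omp` would be derived once from the training sets and reused for every query protein.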
| RESULTS AND DISCUSSION |
|---|
Amino acid composition for the 20 amino acid residues in globular proteins and OMPs
The amino acid composition for the 20 amino acid residues in globular proteins and OMPs has been computed using Equation (1) and the results are displayed in Figure 1a. We observed that the residues Glu, His, Ile, Cys, Gln, Asn and Ser show a significant difference between the composition of globular proteins and OMPs. While the composition of Glu, His, Ile and Cys is higher in globular proteins than in OMPs, the opposite trend is observed for Ser, Asn and Gln. The formation of disulfide bonds between Cys residues requires an oxidative environment and such disulfide bridges are not usually found in intracellular proteins (Branden and Tooze, 1999). Analysis of the three-dimensional structures of 15 β-barrel OMPs shows the presence of just eight (0.1%) Cys residues and none of them is in the membrane part (Gromiha and Suwa, 2003). Hence, the occurrence of Cys is significantly higher in globular proteins than in OMPs. Glu is a strong helix former (Chou and Fasman, 1978) and this tendency contributes to its higher occurrence in globular proteins than in OMPs. The comparative analysis of the occurrence of Ile in the β-strand segments of globular proteins and OMPs revealed that the preference for Ile in OMPs is lower than that in globular proteins (Gromiha and Suwa, 2003), which may increase its occurrence in globular proteins.
On the other hand, the composition of the residues Ser, Asn and Gln is significantly higher in OMPs than in globular proteins (Fig. 1a). The structural analysis of several OMPs shows that these residues play an important role in the stability and function of OMPs. In OmpA, the interiors of β-strands contain an extended hydrogen bonding network of charged and polar residues; in OmpT, the side chains of the residues Ser22, Gln228 and Asn258, located above the membrane, form hydrogen bonds to main chain atoms in the β-barrel. Interestingly, none of the residues that have a high composition in globular proteins (Glu, His, Ile and Cys) is involved in such patterns (Pautsch and Schulz, 2000; Vandeputte-Rutten et al., 2001). The binding of cyanocobalamin (CN-Cbl) to BtuB is important for its function. The binding region of this protein, the hatch domain, is dominated by the residues Ser, Gln and Asn, which form van der Waals and hydrogen bonding interactions to stabilize the hatch apices. In particular, the residues Asn185 and Asn276 are important for the stability of the upper surface of the CN-Cbl binding pocket (Chimento et al., 2003a,b). In FecA, Yue et al. (2003) showed that the binding pockets for diferric dicitrate involve hydrogen bonds from the three residues Gln176, Gln570 and Asn721. Similar observations are also reported for the recognition of TonB with FepA (Buchanan et al., 1999) and hydrolysis of a substrate binding to outer membrane phospholipase A (Snijder et al., 1999). It has also been reported that the replacement of His100 by Asn increased the stability of OmpX (Vogt and Schulz, 1999). Further, the structure and function of a peptide-protein complex of Omp32 is mainly achieved by the interaction of eight residues in the peptide, which are dominated by Asn, Gln and Ser (Zeth et al., 2000).
We infer from these observations that the high occurrence of Ser, Asn and Gln in OMPs is required for the formation of β-barrel structures in the membrane, the stability of binding pockets and the function of OMPs. The analysis of the three-dimensional structures of OMPs also revealed the higher occurrence of Ser, Asn and Gln in OMPs than in globular proteins.
We have also analyzed whether any patterns exist between amino acid compositional differences and their characteristics. The difference between the compositions of each amino acid residue in globular proteins and OMPs is plotted in Fig. 1b. In this figure, the amino acids are placed in the order of aliphatic, aromatic, sulfur-containing, polar and charged residues. The residues with a positive difference indicate their high preference in globular proteins and those with a negative value show their dominance in OMPs. We observed that Asp, Phe, Val and Trp did not show significant differences between globular proteins and OMPs. Among the hydrophobic residues, none shows a specific preference for globular proteins or OMPs except Ile (left-hand side of Fig. 1b). The presence of sulfur-containing residues, Cys and Met, is higher in globular proteins than in OMPs. The polar residues have a higher occurrence in OMPs than in globular proteins. The influence of charged residues is interesting; Glu has significantly higher occurrence in globular proteins than in OMPs while Asp has no preference. On the other hand, Lys and His occur more often in globular proteins than in OMPs while the opposite trend is observed for Arg. The occurrence of Pro is lower in OMPs, which might be due to the fact that Pro is not a favored residue in membrane environments (Deber et al., 1990; Gromiha et al., 1997; Gromiha, 1999).
Discrimination of globular proteins and OMPs
We have calculated the amino acid composition for each of the 674 globular proteins and 377 OMPs using Equation 1 (Table 1). For each protein, we have calculated the total deviation in the amino acid compositions of the 20 amino acid residues from the average values given in Table 1. As an example, in Table 2 we present the amino acid composition of a typical globular protein and a typical OMP, adenovirus DNA binding protein (1ADT) and OutD protein, respectively. In this table, we have included the deviations from globular proteins (Δglob) and OMPs (ΔOMP) for all the 20 amino acid residues and the total deviation. For 1ADT, the deviation of amino acid composition from globular proteins (34.18) is less than that from OMPs (39.89) and hence this protein is predicted as a non-OMP, in agreement with its structural information (Tucker et al., 1994). On the other hand, for OutD protein, the deviation from OMPs (16.09) is less than that from globular proteins (23.70) and hence it is identified as an OMP, in agreement with experimental observations. The total deviations for a sample set of 40 globular proteins belonging to all four structural classes and 10 OMPs are given in Table 3. We have correctly identified 334 out of 377 OMPs (89%) and excluded 531 of 674 globular proteins (79%). The validity-check method described earlier yielded an average accuracy of 84 and 78%, respectively, for correctly identifying OMPs and excluding globular proteins. The cross-validation method, in which each OMP was tested using the frequency computed with a set of non-homologous sequences, showed an accuracy of 80% for correctly assigning OMPs. We have carried out a jack-knife test using a dataset of 27 non-redundant OMP sequences and the OMPs were discriminated with an accuracy of 93%. Further, the present method performed very well in discriminating different sets of β-barrel porins, α-helical membrane proteins and aquaporins (typical α-helical membrane proteins in which one transmembrane segment does not penetrate through the membrane) from the Transport Classification Database (Busch and Saier, 2002). It correctly identified 91% of the 85 β-barrel porins and excluded 74% of 19 aquaporins and 88% of 16 α-helical membrane proteins.
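The back-check percentages and the two worked examples quoted above follow directly from the reported counts and deviations. The short Python check below reproduces the arithmetic; the helper names `percent` and `predict_from_deviations` are hypothetical, and only numbers stated in the text are used.

```python
def percent(correct, total):
    """Accuracy as a percentage, rounded to the nearest whole number."""
    return round(100.0 * correct / total)

def predict_from_deviations(dev_glob, dev_omp):
    """The smaller total deviation decides the class."""
    return "OMP" if dev_omp < dev_glob else "non-OMP"

print(percent(334, 377))                       # 89  (OMPs correctly identified)
print(percent(531, 674))                       # 79  (globular proteins excluded)
print(predict_from_deviations(34.18, 39.89))   # non-OMP  (1ADT)
print(predict_from_deviations(23.70, 16.09))   # OMP      (OutD)
```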
Analysis based on different structural classes and proteins of different size
We have further analyzed the prediction results based on the different structural classes, all-α, all-β, α + β and α/β. We observed that the prediction accuracies for these four classes of proteins are, respectively, 83, 67, 77 and 86%, indicating better performance for all-α and α/β proteins. The accuracy for the all-β structural class is improved to 73% when the proteins belonging to this class alone are used to compute the amino acid composition. Furthermore, we divided the proteins based on their size and we observed that the proteins with fewer than 300 residues are correctly excluded from OMPs with an accuracy of about 80%, and the proteins of large size are excluded with an accuracy of 87%. Proteins that have 301–400 residues are excluded with 74% accuracy. Among OMPs, the large proteins (more than 800 residues) are correctly identified with an accuracy of 97% and the proteins with more than 200 residues are picked up with about 88% accuracy. Proteins with 200 residues or fewer are predicted with an accuracy of 73%.
Prediction results for α-helical transmembrane proteins and OMPs of known structure
We have calculated the amino acid composition for a set of 268 well-annotated α-helical transmembrane proteins, which were not used in the training data. We have successfully eliminated 213 of the 268 α-helical transmembrane proteins, an accuracy of 80%. When tested against a set of aquaporins and α-helical membrane proteins in the Transport Classification Database, this method correctly predicted most of them not to be OMPs (74 and 88%, respectively).
Furthermore, we have set up a representative set of 19 non-redundant OMPs. The amino acid composition and the total deviation from globular proteins and OMPs for all these 19 OMPs are presented in Table 4. Our method has picked up 18 of these 19 OMPs with an accuracy of 95%. The missed protein (1MM4) is of small size (170 residues) and it has more than 10% helical content.
Influence of specific residues for discrimination
We have analyzed the influence of the seven amino acid residues (Glu, His, Ile, Cys, Gln, Asn and Ser) that show significantly different abundance between globular proteins and OMPs, i.e. the residues that are rich in either globular proteins or OMPs (Table 1 and Fig. 1a). Interestingly, using only these seven amino acid residues, we could discriminate the OMPs with the same accuracy as when using all the 20 amino acid residues (89 and 95% in the datasets of well-annotated sequences and known three-dimensional structures, respectively). However, the exclusion of globular and α-helical membrane proteins is slightly less accurate than that with all residues, by 5% in the case of globular proteins and 1% in α-helical membrane proteins (the accuracy levels are 74 and 79%, respectively).
From Table 1 we observed that the amino acid compositions of Ser, Asn, Gln, Thr, Gly, Tyr, Ala, Arg and Leu are higher in OMPs than in globular proteins. Using these nine residues, we have tried to discriminate OMPs. As expected, these residues could successfully exclude globular and α-helical membrane proteins (81 and 91%, respectively). However, the identification of OMPs from annotated sequences and known three-dimensional structures is rather moderate (71 and 79%, respectively). On the other hand, the residues that have a higher composition in globular proteins than in OMPs picked up the OMPs with high accuracy, whereas the exclusion of globular and α-helical membrane proteins is poor.
We have also examined the accuracy of discriminating OMPs using the amino acid composition of each of the 20 amino acid residues. In this method, we used the composition of only one amino acid residue at a time for discrimination and the calculation was repeated 20 times. The accuracy of discriminating OMPs using each amino acid residue is presented in Table 5. We found that most of the residues have a predictive ability of 40–65%. Cys is the best predictor of OMPs and the residues Glu, His, Ile and Ser identified the OMPs with an accuracy of more than 70%. Interestingly, these five residues are among the seven residues that showed a significant difference between the compositions of globular proteins and OMPs. On the other hand, Ser has the highest accuracy of excluding globular proteins, followed by Asn, and these two residues also differ significantly between the compositions of globular proteins and OMPs. Further, we noticed that Ser is able to predict OMPs and exclude globular proteins at an accuracy level of about 70%. This might be due to Ser having the largest compositional difference between globular proteins and OMPs (Fig. 1b).
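The residue-subset experiments in this section (seven residues, nine residues, or one residue at a time) amount to restricting the deviation sum to the chosen residues. The sketch below illustrates this under stated assumptions: the names `subset_deviation` and `classify_subset` are hypothetical, the compositions are not renormalized over the subset (the text does not say whether they were), ties go to "globular", and all percentages are invented for illustration rather than taken from Table 1.

```python
SEVEN = "EHICQNS"   # Glu, His, Ile, Cys, Gln, Asn, Ser (one-letter codes)

def subset_deviation(comp_x, comp_ref, residues):
    """Sum of absolute composition differences over the chosen residues only."""
    return sum(abs(comp_x[aa] - comp_ref[aa]) for aa in residues)

def classify_subset(comp_x, comp_glob, comp_omp, residues=SEVEN):
    """Assign the class with the smaller deviation over `residues`."""
    d_glob = subset_deviation(comp_x, comp_glob, residues)
    d_omp = subset_deviation(comp_x, comp_omp, residues)
    return "OMP" if d_omp < d_glob else "globular"   # assumption: ties -> globular

# Invented percentages for illustration only; these are NOT the Table 1 values.
comp_glob = {"E": 6.5, "H": 2.3, "I": 5.6, "C": 1.7, "Q": 3.8, "N": 4.5, "S": 6.0}
comp_omp = {"E": 4.5, "H": 1.2, "I": 4.3, "C": 0.3, "Q": 4.6, "N": 6.0, "S": 8.0}
protein_x = {"E": 4.0, "H": 1.0, "I": 4.0, "C": 0.0, "Q": 5.0, "N": 6.5, "S": 9.0}
protein_y = {"E": 7.0, "H": 2.5, "I": 6.0, "C": 2.0, "Q": 3.5, "N": 4.0, "S": 5.5}

print(classify_subset(protein_x, comp_glob, comp_omp))        # OMP
print(classify_subset(protein_y, comp_glob, comp_omp))        # globular
# One residue at a time: pass a single letter as the subset.
print(classify_subset(protein_x, comp_glob, comp_omp, "C"))   # OMP
```

Running the single-residue call for each of the 20 letters over a labelled dataset and counting correct assignments would give a per-residue accuracy table of the kind described above.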
Deviation versus correlation
In the present method, we have used the measure of deviation for discriminating OMPs from other globular and membrane proteins. We have also examined the use of correlation (instead of deviation) for discrimination. For each protein, we have calculated the correlation coefficients with the amino acid compositions of globular proteins and OMPs. The target protein is assigned to the outer membrane type if it has a higher correlation coefficient with OMPs than with globular proteins, and vice versa. This type of approach has been previously used to distinguish the structural class of globular proteins (Chou and Zhang, 1992). We observed that the deviation measure could discriminate the OMPs better than the correlation coefficient. With the correlation coefficient, the accuracies for discriminating OMPs in a set of 377 annotated sequences, the 15 OMPs used by Wimley (2002) and 19 non-redundant OMPs are, respectively, 84, 93 and 95%, and the accuracies of correctly excluding globular and inner membrane proteins are 77 and 84%, respectively.
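The correlation-based alternative can be sketched as follows. The text does not specify which correlation coefficient was used, so the Pearson coefficient here is an assumption, as are the names `pearson` and `classify_by_correlation`, the tie rule, and the invented seven-residue compositions (not Table 1 values; a real run would use all 20 residues).

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)   # undefined if either input is constant

def classify_by_correlation(comp_x, comp_glob, comp_omp):
    """Assign the class whose reference composition correlates better."""
    keys = sorted(comp_x)
    x = [comp_x[k] for k in keys]
    r_glob = pearson(x, [comp_glob[k] for k in keys])
    r_omp = pearson(x, [comp_omp[k] for k in keys])
    return "OMP" if r_omp > r_glob else "globular"   # assumption: ties -> globular

# Invented percentages for illustration only; these are NOT the Table 1 values.
comp_glob = {"E": 6.5, "H": 2.3, "I": 5.6, "C": 1.7, "Q": 3.8, "N": 4.5, "S": 6.0}
comp_omp = {"E": 4.5, "H": 1.2, "I": 4.3, "C": 0.3, "Q": 4.6, "N": 6.0, "S": 8.0}
protein_x = {"E": 4.0, "H": 1.0, "I": 4.0, "C": 0.0, "Q": 5.0, "N": 6.5, "S": 9.0}
protein_y = {"E": 7.0, "H": 2.5, "I": 6.0, "C": 2.0, "Q": 3.5, "N": 4.0, "S": 5.5}

print(classify_by_correlation(protein_x, comp_glob, comp_omp))   # OMP
print(classify_by_correlation(protein_y, comp_glob, comp_omp))   # globular
```

Unlike the total deviation, the correlation coefficient is insensitive to a uniform shift or scaling of the composition profile, which is one possible reason the two measures rank proteins differently.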
Effect of dataset on discrimination
We have examined the relative importance of the present method and the information gained from the dataset for discriminating OMPs, i.e. whether the dataset or the method is responsible for the high accuracy. We have set up several datasets of randomly selected globular proteins and OMPs with different sizes and applied our method for predicting OMPs and excluding globular proteins. The present method based on the deviation of amino acid composition correctly identified the OMPs with an average accuracy of 87% (in the range of 85–89% in ten different sets of data) and excluded the globular proteins with 75% accuracy. These accuracy levels are marginally lower than those obtained with the original dataset (674 globular proteins and 377 OMPs) used in the present work (89 and 79%, respectively, for OMPs and globular proteins). On the other hand, as we discussed in the previous section, the method based on the correlation coefficient using the original dataset showed a reduced accuracy of 84 and 77%, respectively, for predicting OMPs and excluding globular proteins. Hence, from this analysis we observed that the combination of our methodology and the information gained from the selection of a good dataset significantly improved the accuracy of discriminating OMPs.
Comparison with other methods
Gnanasekaran et al. (2000) devised a method based on sequence alignment profiles of porins to identify the β-stranded OMPs and reported an accuracy of about 80%. Liu et al. (2003) proposed a method based on the amino acid composition of residues in transmembrane β-strand segments to discriminate β-barrel membrane proteins. They used just 12 proteins for developing the parameters and tested with 241 OMPs, and the accuracy was reported to be 84%. As the membrane spanning segments are used to compute the amino acid composition, this method could identify the OMPs that have a high content of amino acid residues in the membrane, but it missed the proteins with fewer membrane spanning β-strand segments. As an example, this method failed to identify 7AHL (α-hemolysin), which has two membrane spanning β-strands, as an OMP. Liu et al. (2003) also evaluated the method proposed by Wimley (2002) and reported an accuracy of 53% for discriminating β-barrel membrane proteins. For the structural data used in Wimley (2002) the accuracy is 75%, whereas our method identified all the OMPs in the same dataset (100% accuracy). Martelli et al. (2002) devised a neural network method using 12 OMPs and tested it on 145 OMPs, which yielded an accuracy of 84%. Bagos et al. (2004a,b) used an HMM for discriminating β-barrel OMPs and reported an accuracy of 88% for a set of 133 OMPs. We have used a set of 377 OMPs and discriminated them with an accuracy of 88%. Further, the present method correctly excluded 80% of the transmembrane α-helical proteins whereas the accuracy is 73% using the HMM (Bagos et al., 2004a,b). The high accuracy achieved by the present method is due to the method itself as well as the information gained from the large dataset of globular proteins and OMPs. As the sequence information alone is sufficient to derive the composition, one can refine the parameters frequently as the number of sequences grows, which may improve the accuracy. Further, the present method is simple and easy to incorporate in any algorithm.
| CONCLUSIONS |
|---|
We have systematically analyzed the amino acid sequences of globular proteins and OMPs and developed the amino acid composition parameters for these classes of proteins. The similarities and differences of the 20 amino acid residues between globular proteins and OMPs have been brought out. Based on the results, we have devised a statistical method based on the deviation of the amino acid composition to identify the OMPs and to exclude other proteins. Our method correctly identified 84% of the OMPs and excluded up to 80% of the globular proteins, α-helical membrane proteins and aquaporins. These accuracy levels are comparable to or better than other methods in the literature. We suggest that this simple method could be effectively used to discriminate OMPs and for detecting OMPs in genomic sequences.
| Acknowledgments |
|---|
We sincerely thank the referees for constructive comments. We acknowledge Dr. Yutaka Akiyama for encouragement, Dr. Tamotsu Noguchi for the OMP structure data, Dr. Taishin Kin and Mr. Hideki Nagasaki for useful discussions and technical help and Dr. Paul Horton for reading the manuscript.
Received on June 24, 2004; revised on September 8, 2004; accepted on October 20, 2004
| REFERENCES |
|---|
Ahmad, S. and Gromiha, M.M. (2002) NETASA: neural network based prediction of solvent accessibility. Bioinformatics, 18, 819824
Bagos, P.G., Liakopoulos, T.D., Spyropoulos, I.C., Hamodrakas, S.J. (2004a) A Hidden Markov Model method, capable of predicting and discriminating beta-barrel outer membrane proteins. BMC Bioinformatics, 5, 29[CrossRef][Medline].
Bagos, P.G., Liakopoulos, T.D., Spyropoulos, I.C., Hamodrakas, S.J. (2004b) PRED-TMBB: a web server for predicting the topology of beta-barrel outer membrane proteins. Nucleic Acids Res., 32, W400W404
Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., Bourne, P.E. (2000) The Protein Data Bank. Nucleic Acids Res., 28, 235242
Bigelow, H.R., Petrey, D.S., Liu, J., Przybylski, D., Rost, B. (2004) Predicting transmembrane beta-barrels in proteomes. Nucleic Acids Res., 32, 25662577
Branden, C. and Tooze, C. Introduction to Protein Structure, (1999) , New York Garland Publishing Inc.
Bu, W.S., Feng, Z.P., Zhang, Z., Zhang, C.T. (1999) Prediction of protein (domain) structural classes based on amino-acid index. Eur. J. Biochem., 266, 1043–1049.
Buchanan, S.K., Smith, B.S., Venkatramani, L., Xia, D., Esser, L., Palnitkar, M., Chakraborty, R., van der Helm, D., Deisenhofer, J. (1999) Crystal structure of the outer membrane active transporter FepA from Escherichia coli. Nat. Struct. Biol., 6, 56–63.
Busch, W. and Saier, M.H., Jr. (2002) The transporter classification (TC) system, 2002. Crit. Rev. Biochem. Mol. Biol., 37, 287–337.
Chimento, D.P., Mohanty, A.K., Kadner, R.J., Wiener, M.C. (2003a) Substrate-induced transmembrane signaling in the cobalamin transporter BtuB. Nat. Struct. Biol., 10, 394–401.
Chimento, D.P., Kadner, R.J., Wiener, M.C. (2003b) The Escherichia coli outer membrane cobalamin transporter BtuB: structural analysis of calcium and substrate binding, and identification of orthologous transporters by sequence/structure conservation. J. Mol. Biol., 332, 999–1014.
Chou, P.Y. and Fasman, G.D. (1978) Prediction of the secondary structure of proteins from their amino acid sequence. Adv. Enzymol., 47, 45–148.
Chou, K.C. and Maggiora, G.M. (1998) Domain structural class prediction. Protein Eng., 11, 523–538.
Chou, K.C. and Zhang, C.T. (1992) A correlation-coefficient method to predicting protein-structural classes from amino acid compositions. Eur. J. Biochem., 207, 429–433.
Cuff, J.A. and Barton, G.J. (1999) Evaluation and improvement of multiple sequence methods for protein secondary structure prediction. Proteins, 34, 508–519.
Deber, C.M., Glibowicka, M., Woolley, G.A. (1990) Conformations of proline residues in membrane environments. Biopolymers, 29, 149–157.
Gardy, J.L., Spencer, C., Wang, K., Ester, M., Tusnady, G.E., Simon, I., Hua, S., deFays, K., Lambert, C., Nakai, K., Brinkman, F.S. (2003) PSORT-B: Improving protein subcellular localization prediction for Gram-negative bacteria. Nucleic Acids Res., 31, 3613–3617.
Gnanasekaran, T.V., Peri, S., Arockiasamy, A., Krishnaswamy, S. (2000) Profiles from structure based sequence alignment of porins can identify beta stranded integral membrane proteins. Bioinformatics, 16, 839–842.
Gromiha, M.M. (1999) A simple method for predicting transmembrane alpha helices with better accuracy. Protein Eng., 12, 557–561.
Gromiha, M.M. and Ponnuswamy, P.K. (1993) Prediction of transmembrane beta-strands from hydrophobic characteristics of proteins. Int. J. Pept. Protein Res., 42, 420–431.
Gromiha, M.M. and Ponnuswamy, P.K. (1995) Prediction of protein secondary structures from their hydrophobic characteristics. Int. J. Pept. Protein Res., 45, 225–240.
Gromiha, M.M. and Suwa, M. (2003) Variation of amino acid properties in all-beta globular and outer membrane protein structures. Int. J. Biol. Macromol., 32, 93–98.
Gromiha, M.M., Majumdar, R., Ponnuswamy, P.K. (1997) Identification of membrane spanning beta strands in bacterial porins. Protein Eng., 10, 497–500.
Gromiha, M.M., Ahmad, S., Suwa, M. (2004) Neural network-based prediction of transmembrane beta-strand segments in outer membrane proteins. J. Comput. Chem., 25, 762–767.
Hirokawa, T., Boon-Chieng, S., Mitaku, S. (1998) SOSUI: classification and secondary structure prediction system for membrane proteins. Bioinformatics, 14, 378–379.
Klein, P. (1986) Prediction of protein structural class by discriminant analysis. Biochim. Biophys. Acta, 874, 205–215.
Kumarevel, T.S., Gromiha, M.M., Ponnuswamy, M.N. (2000) Structural class prediction: an application of residue distribution along the sequence. Biophys. Chem., 88, 81–101.
Liu, Q., Zhu, Y., Wang, B., Li, Y. (2003) Identification of beta-barrel membrane proteins based on amino acid composition properties and predicted secondary structure. Comput. Biol. Chem., 27, 355–361.
Martelli, P.L., Fariselli, P., Krogh, A., Casadio, R. (2002) A sequence-profile-based HMM for predicting and discriminating beta barrel membrane proteins. Bioinformatics, 18, S46–S53.
Mitaku, S., Hirokawa, T., Tsuji, T. (2002) Amphiphilicity index of polar amino acids as an aid in the characterization of amino acid preference at membrane-water interfaces. Bioinformatics, 18, 608–616.
Murzin, A.G., Brenner, S.E., Hubbard, T., Chothia, C. (1995) SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol., 247, 536–540.
Natt, N.K., Kaur, H., Raghava, G.P. (2004) Prediction of transmembrane regions of beta-barrel proteins using ANN- and SVM-based methods. Proteins, 56, 11–18.
Pautsch, A. and Schulz, G.E. (2000) High-resolution structure of the OmpA membrane domain. J. Mol. Biol., 298, 273–282.
Schulz, G.E. (2002) The structure of bacterial outer membrane proteins. Biochim. Biophys. Acta, 1565, 308–317.
Snijder, H.J., Ubarretxena-Belandia, I., Blaauw, M., Kalk, K.H., Verheij, H.M., Egmond, M.R., Dekker, N., Dijkstra, B.W. (1999) Structural evidence for dimerization-regulated activation of an integral membrane phospholipase. Nature, 401, 717–721.
Tucker, P.A., Tsernoglou, D., Tucker, A.D., Coenjaerts, F.E., Leenders, H., van der Vliet, P.C. (1994) Crystal structure of the adenovirus DNA binding protein reveals a hook-on model for cooperative DNA binding. EMBO J., 13, 2994–3002.
Vandeputte-Rutten, L., Kramer, R.A., Kroon, J., Dekker, N., Egmond, M.R., Gros, P. (2001) Crystal structure of the outer membrane protease OmpT from Escherichia coli suggests a novel catalytic site. EMBO J., 20, 5033–5039.
Vogt, J. and Schulz, G.E. (1999) The structure of the outer membrane protein OmpX from Escherichia coli reveals possible mechanisms of virulence. Structure, 7, 1301–1309.
Wang, Z.X. and Yuan, Z. (2000) How good is prediction of protein structural class by the component-coupled method? Proteins, 38, 165–175.
White, S.H. and Wimley, W.C. (1999) Membrane protein folding and stability: physical principles. Annu. Rev. Biophys. Biomol. Struct., 28, 319–365.
Wimley, W.C. (2002) Toward genomic identification of beta-barrel membrane proteins: composition and architecture of known structures. Protein Sci., 11, 301–312.
Yue, W.W., Grizot, S., Buchanan, S.K. (2003) Structural evidence for iron-free citrate and ferric citrate binding to the TonB-dependent outer membrane transporter FecA. J. Mol. Biol., 332, 353–368.
Zeth, K., Diederichs, K., Welte, W., Engelhardt, H. (2000) Crystal structure of Omp32, the anion-selective porin from Comamonas acidovorans, in complex with a periplasmic peptide at 2.1 Å resolution. Structure, 8, 981–992.