Bioinformatics Advance Access originally published online on May 26, 2005
Bioinformatics 2005 21(15):3312-3313; doi:10.1093/bioinformatics/bti507
SECISDesign: a server to design SECIS-elements within the coding sequence
Friedrich-Schiller-University Jena, Institute of Computer Science, Chair for Bioinformatics, Ernst-Abbe-Platz 2, 07743 Jena, Germany
*To whom correspondence should be addressed.
| Abstract |
|---|
Summary: SECISDesign is a server for the design of SECIS-elements and arbitrary RNA-elements within the coding sequence of an mRNA. The element has to satisfy both structure and sequence constraints. At the same time, a certain amino acid similarity to the original protein has to be kept. The designed sequence can be used for recombinant expression of selenoproteins in Escherichia coli.
Availability: The server is available at http://www.bio.inf.uni-jena.de/Software/SECISDesign/index.html
Contact: backofen{at}inf.uni-jena.de
| MOTIVATION |
|---|
Selenoproteins contain the 21st amino acid, selenocysteine. Since selenocysteine is encoded by the stop codon UGA, its insertion additionally depends on a specific mRNA sequence and structure downstream of the UGA (called the SEC Insertion Sequence, SECIS). Selenoproteins have attracted much interest, since they are of fundamental importance to human health and are essential components of several major metabolic pathways, such as antioxidant defence systems, thyroid hormone metabolism and immune function (for a review see Brown and Arthur, 2001). For this reason, there is considerable interest in the catalytic properties of selenoproteins, especially since a selenoprotein can show greatly enhanced enzymatic activity compared with its cysteine homologue.
Detailed biochemical investigation of selenoproteins requires the production of a sufficient amount of pure protein, for which an Escherichia coli-based recombinant expression system is often used. A problem arises for eukaryotic selenoproteins, since the selenocysteine insertion mechanisms differ between E.coli and eukaryotes. In eukaryotes, the SECIS-element is located in the 3'-untranslated region. In contrast, the SECIS of E.coli must follow the UGA immediately, i.e. it is located in the protein coding sequence.
Therefore, recombinant expression of selenoproteins is complicated and rarely successful. This results in a low amount of pure protein, which complicates biochemical analyses (e.g. structure determination). Furthermore, there are only a few cases of successful heterologous expression (e.g. Bar-Noy and Moskovitz, 2002). All of them required a careful, hand-crafted design of the nucleotide sequence. The design of SECIS-elements is therefore a crucial step for the expression of selenoproteins in E.coli.
By and large, the situation is as follows. First, a eukaryotic selenoprotein cannot be expressed directly in the E.coli system, since this requires an appropriate SECIS-element directly after the UGA, which is not present in the eukaryotic gene. The same holds for a protein that naturally contains a cysteine which we want to replace by a selenocysteine. Second, the design of a new SECIS is likely to change the protein sequence. Therefore, one has to make a compromise between changes in the protein sequence and the efficiency of selenocysteine insertion. To ensure high efficiency of a designed SECIS-element, we optimize its similarity to the given element and its stability. The former considers structure and sequence; the latter is assessed by the free energy of the mRNA.
| THE SERVER |
|---|
SECISDesign is a server for designing SECIS-elements. Our method consists of two parts: mRNA structure optimization and a heuristic approach using inverse RNA folding, i.e. the design of RNA sequences that fold into a given structure with high probability.
The input of the algorithm consists of the structure of the SECIS-element, its nucleotide sequence and the amino acid sequence into which we wish to insert selenocysteine. Some parts of the structure are mandatory, whereas others only improve the efficiency of the element. Therefore, the latter features can be declared optional. Inserting a SECIS-element poses the problem of finding an mRNA sequence that contains an efficient SECIS at the right position. In addition, we have to guarantee that the encoded amino acid sequence has a high similarity to the original one (measured with BLOSUM62 or PAM250). Often, some amino acids must not be changed in order to preserve the biological function of the protein. Therefore, the user can specify such fixed positions.
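The two amino acid constraints described above (a similarity score against the original protein and a set of positions that must not change) can be illustrated with a minimal sketch. The function names and the `toy_score` function are hypothetical; `toy_score` is only a stand-in for a BLOSUM62 or PAM250 lookup, and the sequences are invented for illustration.

```python
def similarity(original, designed, score):
    """Sum pairwise scores over aligned positions of two equal-length proteins."""
    if len(original) != len(designed):
        raise ValueError("sequences must have equal length")
    return sum(score(a, b) for a, b in zip(original, designed))


def violated_fixed_positions(original, designed, fixed):
    """Return the fixed positions (0-based) whose amino acid was changed."""
    return [i for i in sorted(fixed) if original[i] != designed[i]]


def toy_score(a, b):
    # Hypothetical stand-in for a substitution matrix such as BLOSUM62:
    # +1 for identical residues, -1 otherwise.
    return 1 if a == b else -1


original = "MKCLAGH"   # invented example protein
designed = "MKULVGH"   # U = selenocysteine at index 2, A -> V at index 4
print(similarity(original, designed, toy_score))            # 5 matches - 2 mismatches = 3
print(violated_fixed_positions(original, designed, {0, 5}))  # []
```

With a real substitution matrix only `toy_score` would need to be replaced; the fixed-position check is independent of the scoring scheme.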
During the first step of SECISDesign, a dynamic programming approach designs an mRNA that is optimal with regard to two aspects. First, it can fold into the target structure and has maximal similarity to a given SECIS-sequence. Second, it encodes an amino acid sequence that is maximally similar to the original one. It also preserves the fixed positions. Even if the nucleotide and amino acid conditions are contradictory, a solution will be found by allowing insertions and deletions on the amino acid level. For more details see Backofen and Busch (2004).
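The trade-off between nucleotide similarity and amino acid similarity can be made concrete with a deliberately simplified sketch. This is not the dynamic program of Backofen and Busch (2004): it ignores the base-pairing constraints of the target structure and does not allow insertions or deletions, so every codon can be chosen independently. The names `best_codon`, `design` and `aa_weight` are hypothetical, and amino acid similarity is reduced to identity instead of BLOSUM62 or PAM250.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def best_codon(target_triplet, original_aa, aa_weight, must_keep):
    """Pick the sense codon maximizing nucleotide matches plus an amino acid bonus."""
    candidates = []
    for codon, aa in CODON_TABLE.items():
        if aa == "*" or (must_keep and aa != original_aa):
            continue  # skip stop codons and forbidden substitutions
        nt_score = sum(x == y for x, y in zip(codon, target_triplet))
        aa_score = aa_weight if aa == original_aa else 0
        candidates.append((nt_score + aa_score, codon))
    return max(candidates)[1]  # ties are broken by codon string order


def design(target_rna, protein, aa_weight=2, fixed=frozenset()):
    """Choose one codon per residue; structure constraints are ignored (assumption)."""
    if len(target_rna) != 3 * len(protein):
        raise ValueError("target must contain exactly one triplet per residue")
    return "".join(
        best_codon(target_rna[3 * i:3 * i + 3], aa, aa_weight, i in fixed)
        for i, aa in enumerate(protein)
    )


print(design("GCAUGG", "AC", aa_weight=2))             # GCAUGU (protein kept: AC)
print(design("GCAUGG", "AC", aa_weight=0))             # GCAUGG (target kept, C -> W)
print(design("GCAUGG", "AC", aa_weight=0, fixed={1}))  # GCAUGU (fixed Cys enforced)
```

Raising `aa_weight` favours the original protein, lowering it favours the target nucleotide sequence. Once base pairs couple distant codons, the choices are no longer independent, which is where a dynamic program becomes necessary.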
In a second step, SECISDesign tries to optimize the free energy of the target structure to increase the folding probability while the constraints from the first step are retained. To this end, inverse RNA folding is initialized with the sequence resulting from the first step and is carried out by one of the following local search approaches: adaptive walk (AW), full local search (FLS) and stochastic local search (SLS). They differ in the way they accept or reject a newly found solution as a replacement for the previous one. This is known as the search criterion. During this search, a further nucleotide mutation is accepted if its foldability is allowed by the search criterion and if a certain similarity on the amino acid level is kept. One can choose from the following methods to evaluate the RNA foldability: minimizing the distance between the minimum free energy structure of the designed sequence and the target structure (mfe-mode), maximizing the probability of the sequence folding into the desired structure (pf-mode) (Hofacker et al., 1994), and minimizing the average number of incorrectly paired nucleotides (nc-mode) (Dirks et al., 2004). To evaluate the energy and the foldings, we use the Vienna RNA Package (Hofacker et al., 1994). In most cases, the best results were found by using FLS and a combination of the mfe-mode and either pf- or nc-mode.
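As a rough illustration of the local search, the following sketch implements an adaptive walk under stated assumptions: a random point mutation is accepted only if it strictly improves a foldability score and leaves the encoded protein unchanged (the strictest form of the amino acid constraint). The function `toy_foldability` is a hypothetical stand-in that counts target base pairs whose nucleotides can pair; the server itself evaluates foldability with the Vienna RNA Package in mfe-, pf- or nc-mode. The input sequence and structure are invented.

```python
import random
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
CAN_PAIR = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def parse_pairs(dot_bracket):
    """Convert a dot-bracket string into a list of paired index tuples."""
    stack, pairs = [], []
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.append((stack.pop(), i))
    return pairs


def translate(rna):
    """Translate an in-frame RNA whose length is a multiple of three."""
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna), 3))


def toy_foldability(rna, pairs):
    # Hypothetical score: number of target pairs that are AU, GC or GU.
    return sum((rna[i], rna[j]) in CAN_PAIR for i, j in pairs)


def adaptive_walk(rna, target, steps=500, seed=0):
    """Accept random point mutations that improve the score and keep the protein."""
    if len(rna) != len(target) or len(rna) % 3:
        raise ValueError("need equal lengths and a whole number of codons")
    rng = random.Random(seed)
    pairs = parse_pairs(target)
    protein = translate(rna)
    current, current_score = rna, toy_foldability(rna, pairs)
    for _ in range(steps):
        pos = rng.randrange(len(current))
        candidate = current[:pos] + rng.choice(BASES) + current[pos + 1:]
        if translate(candidate) != protein:
            continue  # amino acid constraint violated
        score = toy_foldability(candidate, pairs)
        if score > current_score:  # search criterion of the adaptive walk
            current, current_score = candidate, score
    return current, current_score


start = "GGAGCAUUAGCAUCG"   # invented input encoding G-A-L-A-S
target = "((((.......))))"   # invented hairpin target
designed, score = adaptive_walk(start, target)
print(designed, score, translate(designed))
```

Other search criteria could be obtained by changing only the acceptance test, for example by occasionally accepting a non-improving candidate in a stochastic variant.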
Finally, we get a sequence which folds into the SECIS-structure with a high probability and guarantees a given minimum similarity to both the typical SECIS-sequence and the original amino acid sequence.
Usage. In order to obtain an mRNA sequence coding for a selenocysteine-containing protein, the SECISDesign server requires the amino acid sequence and the position where the selenocysteine should be inserted. In addition, the user can give information about positions whose allowed changes are restricted to certain amino acids. Moreover, the target element can be selected from a list of SECIS-elements of E.coli. Although the server is tailored for SECIS, the method is not restricted to the motifs of the list. The user can define new and even unrelated elements. Finally, one can select the search strategy, the evaluation of foldability and further parameters that are used during the second step. To facilitate the use of SECISDesign, recommended values are set by default. Since the computation usually takes only 1 min, the user can test several parameter settings to find the solution that best fits their requirements.
| RESULTS |
|---|
Bar-Noy and Moskovitz (2002) have studied the mammalian methionine sulfoxide reductase B (MsrB). In order to express this selenoprotein in E.coli, they designed a SECIS by hand that gives an amino acid similarity score of 35 (BLOSUM62) but has an unpaired lower part of the stem (Fig. 1). The application of SECISDesign leads to better results concerning the quality of the SECIS-element (since the bases of the lower part of the stem are paired), the probability of folding into the target structure and, often, also the similarity score (Fig. 1).
Conflict of Interest: none declared.
Received on March 29, 2005; accepted on May 17, 2005
| REFERENCES |
|---|
Backofen, R. and Busch, A. (2004) Computational design of new and recombinant selenoproteins. Proceedings of the 15th Annual Symposium on Combinatorial Pattern Matching (CPM2004).
Bar-Noy, S. and Moskovitz, J. (2002) Mouse methionine sulfoxide reductase B: effect of selenocysteine incorporation on its activity and expression of the seleno-containing enzyme in bacterial and mammalian cells. Biochem. Biophys. Res. Commun., 297, 956-961.
Brown, K.M. and Arthur, J.R. (2001) Selenium, selenoproteins and human health: a review. Public Health Nutr., 4, 593-599.
Dirks, R.M. et al. (2004) Paradigms for computational nucleic acid design. Nucleic Acids Res., 32, 1392-1403.
Hofacker, I.L. et al. (1994) Fast folding and comparison of RNA secondary structures. Monatshefte Chem., 125, 167-188.

, FLS (default values for other parameters).
