Bioinformatics Advance Access originally published online on May 19, 2005
Bioinformatics 2005 21(15):3320-3321; doi:10.1093/bioinformatics/bti504
Sequence to Structure (S2S): display, manipulate and interconnect RNA data from sequence to structure
Institut de Biologie Moléculaire et Cellulaire du CNRS, UPR9002, Université Louis Pasteur F-67084 Strasbourg, France
*To whom correspondence should be addressed.
| Abstract |
|---|
Summary: Efficient RNA sequence manipulations (such as multiple alignments) need to be constrained by the rules of RNA structure folding. Structural knowledge has increased dramatically in recent years with the accumulation of several large RNA structures similar to those of the bacterial ribosome subunits. However, no tool in the RNA community provides an easy way to link and integrate progress made at the sequence level using the available three-dimensional information. Sequence to Structure (S2S) proposes a framework in which a user can easily display, manipulate and interconnect heterogeneous RNA data, such as multiple sequence alignments and secondary and tertiary structures. S2S has been implemented in the Java language and has been developed and tested on UNIX systems, such as Linux and Mac OS X.
Availability: S2S is available at http://bioinformatics.org/S2S/
Contact: f.jossinet{at}ibmc.u-strasbg.fr
The RNA database has advanced and is still advancing faster than the bioinformatics tools adapted to manage the diversity and the amount of generated data. Many of the bioinformatics tools developed for proteins are not well suited to RNA data. Consequently, the situation is reminiscent of the early days of protein sequence analysis. This is conspicuous for multiple sequence alignment, which is one of the most used and most valuable bioinformatics techniques. In general, automatic alignments produced with RNA sequences need to be substantially improved by manual editing. Since RNA tertiary structure is more conserved than the sequence, a tool is needed that allows RNA sequence manipulations controlled by the constraints deduced from the structural data. Sequence to Structure (S2S) can be described as a framework containing four interconnected graphical tools (Fig. 1): a core tool (S2SViewer), a multiple sequence alignment editor (Rnalign), a secondary structure editor (Rna2DViewer) and a tertiary structure viewer (Rna3DViewer). Each tool has been designed to connect to the others and to manipulate the displayed data easily and efficiently. Detailed program instructions and technical information can be found in the documentation provided with S2S and on the associated website.
The S2S workflow starts from an RNA tertiary structure stored in a Protein Data Bank (PDB) file (Berman et al., 2000). A secondary structure is automatically calculated from the tertiary structure using an accessory program. By default, S2S uses the RNAVIEW algorithm, which deduces helices, single strands and base pairs from the coordinates (Yang et al., 2003). The base pairs are identified in agreement with the Leontis–Westhof classification (Leontis and Westhof, 2001). All the data parsed and generated are stored in a tree data structure named S2SView, which is displayed by the core tool S2SViewer (Fig. 1). S2SViewer allows the user to manipulate the raw data directly or to launch more specialized S2S visualization tools, such as Rna2DViewer and Rna3DViewer for secondary and tertiary structures, respectively.
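As a rough illustration of the step from annotated base pairs to a secondary structure description, the following Python sketch converts a list of paired positions into a bracket notation. It assumes the annotation program has already yielded nested pairs as 1-based `(i, j)` tuples; the function name `pairs_to_brackets` and the input layout are hypothetical and do not reproduce the RNAVIEW output format or the S2S internals.

```python
def pairs_to_brackets(length, pairs):
    """Return a dot-bracket string for nested base pairs given as 1-based (i, j) tuples."""
    notation = ["."] * length  # unpaired residues stay as dots
    for i, j in pairs:
        if not (1 <= i < j <= length):
            raise ValueError(f"invalid pair ({i}, {j}) for length {length}")
        notation[i - 1] = "("  # 5' partner opens the pair
        notation[j - 1] = ")"  # 3' partner closes it
    return "".join(notation)


# Hypothetical 12-nt hairpin: four stacked pairs closing a 4-nt loop
hairpin_pairs = [(1, 12), (2, 11), (3, 10), (4, 9)]
print(pairs_to_brackets(12, hairpin_pairs))  # prints ((((....))))
```

This sketch handles nested pairs only; pseudoknots and other tertiary interactions would need a second notation or additional bracket symbols.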
Rna2DViewer was formerly known as RnaMLView (Yang et al., 2003). Its main goal is to provide fine control over the display of all the base–base interactions discovered with RNAVIEW. These interactions are placed on the secondary structure, the drawing of which can be manipulated by moving and rotating helices. Since its first version, Rna2DViewer has been modified so that it integrates easily within S2S. In addition, it has been upgraded to provide more functionality.
Since people are used to visualizing tertiary structures with their favorite tool, we have decided to design Rna3DViewer as a wrapper to quickly integrate external applications. By default, the Rna3DViewer tool is connected to the PyMOL application (http://pymol.sourceforge.net/). The documentation supplied with S2S and available on the website gives all the technical details needed to connect other 3D viewers to S2S.
Starting from this structural context, S2S allows the user to align external sequences against each RNA molecule found in the tertiary structure (called the reference RNA molecule). Each alignment can be manually edited inside the Rnalign tool. As with other multiple sequence alignment editors, one can add or delete gaps and move residues for one or several sequences. However, Rnalign offers two ways to help the user check whether the modifications are in agreement with the structural context.
In a first approach, the user can choose one among several structural masks. A structural mask colors each sequence residue according to its corresponding position in one of the two bracket notations. In addition, a structural mask can display, for each sequence in the multiple alignment, the conservation of the base pairs coded in the bracket notations (Fig. 1). The residue letters are displayed as dot characters if they establish, with their partner, an interaction isosteric with the one observed in the reference RNA molecule. The partner residue is deduced from the bracket notation chosen with the structural mask. The isostericity test is made against the 4 × 4 matrices described by Leontis et al. (2002). Other structural mask behaviors are provided within Rnalign and explained in the documentation provided with S2S and on the associated website.
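The masking idea can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Rnalign implementation: the names `partner_map`, `apply_mask` and `TOY_ISOSTERIC_CLASS` are hypothetical, and the isostericity table is a toy stand-in that holds only the four canonical cis Watson–Crick pairs, whereas the real test relies on the full 4 × 4 matrices of Leontis et al. (2002).

```python
def partner_map(brackets):
    """Map each paired 0-based column to its partner column using a stack."""
    stack, partners = [], {}
    for col, symbol in enumerate(brackets):
        if symbol == "(":
            stack.append(col)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced bracket notation")
            opening = stack.pop()
            partners[opening] = col
            partners[col] = opening
    if stack:
        raise ValueError("unbalanced bracket notation")
    return partners


# ASSUMPTION: toy table with a single isosteric class (canonical cis
# Watson-Crick pairs); it is a placeholder for the published matrices.
TOY_ISOSTERIC_CLASS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def apply_mask(reference, sequence, brackets):
    """Replace a residue by '.' when its pair is isosteric with the reference pair."""
    partners = partner_map(brackets)
    masked = []
    for col, residue in enumerate(sequence):
        partner = partners.get(col)
        if partner is None or residue == "-":
            masked.append(residue)  # unpaired column or gap: shown unchanged
            continue
        ref_pair = (reference[col], reference[partner])
        seq_pair = (residue, sequence[partner])
        same_class = ref_pair in TOY_ISOSTERIC_CLASS and seq_pair in TOY_ISOSTERIC_CLASS
        masked.append("." if same_class else residue)
    return "".join(masked)


brackets = "(((....)))"
reference = "GCAUUCGUGC"
print(apply_mask(reference, "GUGUACGUAC", brackets))  # prints ..GUACGU..
```

In the example, the two outer pairs of the aligned sequence covary with the reference and are shown as dots, while the G/U pair at the third helix position is not in the toy table and keeps its letters, which draws the eye to the column that may need editing.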
In a second approach, in order to take into account the tertiary interactions, the reference RNA molecule embedded in a multiple alignment is linked to its counterparts in the secondary and tertiary views. Consequently, at any moment during the manual editing, by clicking on a specific position, S2S highlights in the Rna2DViewer and/or the Rna3DViewer the structural region around the selected position one wishes to align.
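Linking an alignment column to the structure requires translating the clicked column into a residue of the reference molecule. A minimal Python sketch of that translation is given below; it assumes gaps are written as `-` and that the reference residues are numbered contiguously from `first_residue_number`, which is not guaranteed in real PDB files. The function name `column_to_residue` is hypothetical and is not part of the S2S API.

```python
def column_to_residue(aligned_reference, column, first_residue_number=1):
    """Return the reference residue number at a 0-based alignment column, or None on a gap."""
    if not 0 <= column < len(aligned_reference):
        raise IndexError("column outside the alignment")
    if aligned_reference[column] == "-":
        return None  # the reference has no residue in this column
    gaps_before = aligned_reference[:column].count("-")
    return first_residue_number + column - gaps_before


aligned_reference = "GC--AUUC"
print(column_to_residue(aligned_reference, 4))  # prints 3
print(column_to_residue(aligned_reference, 2))  # prints None
```

The returned number could then be passed to whichever secondary or tertiary viewer is connected, so that the region around that residue is highlighted.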
S2S supports the RnaML file format to save the multiple sequence alignments and secondary structures produced during a working session. RnaML is the only file format able to store the heterogeneous data manipulated inside S2S (Waugh et al., 2002). To avoid generating a huge file, the tertiary coordinates are kept in the original PDB file, whose name is registered as a comment in the RnaML file. Besides the RnaML file format, the Rnalign and Rna2DViewer tools can export their display to SVG files compatible with more specialized drawing tools, such as Adobe Illustrator. Finally, Rnalign can generate a FASTA output.
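For readers unfamiliar with the FASTA layout, the following Python sketch writes an aligned set of sequences in that format. It is an illustration only: the function name `to_fasta`, the `(name, aligned_sequence)` input layout and the line width are assumptions, and the exact output produced by Rnalign may differ.

```python
def to_fasta(alignment, width=60):
    """Serialize (name, aligned_sequence) tuples as FASTA text, keeping gap characters."""
    lines = []
    for name, sequence in alignment:
        lines.append(f">{name}")  # header line for each record
        for start in range(0, len(sequence), width):
            lines.append(sequence[start:start + width])  # wrapped sequence lines
    return "\n".join(lines) + "\n"


alignment = [("reference", "GC--AU"), ("seq1", "GCAAAU")]
print(to_fasta(alignment, width=4), end="")
```

Because gaps are kept as ordinary characters, every record has the same length and the alignment can be rebuilt column by column when the file is read back.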
Although starting from a PDB file is the best way to use all the S2S functionalities, it is also possible to use Rna2DViewer linked to Rnalign starting from an Mfold RNA secondary structure (registered as an RnaML file) (Zuker, 2003), or to restrict the use of S2S to Rnalign alone, starting from a multiple sequence alignment stored in a FASTA file.
Finally, in order to make S2S easily upgradable, a Jython (http://www.jython.org/) scripting engine has been embedded within it. Using this scripting language and the S2S API specifications available on the website, small scripts can be written and launched from S2S to extend the functionalities of the framework between major releases. Some scripts are provided with S2S as examples.
| Acknowledgments |
|---|
The authors thank Andreas Werner for corrections to the documentation of the software. This work was supported by a grant from the Action Concertée Incitative Informatique, Mathématique et Physique en Biologie Moléculaire (Ministère de la Recherche–CNRS) (IMPBio: http://impbio.lirmm.fr/).
Conflict of Interest: none declared.
Received on March 14, 2005; revised on May 17, 2005; accepted on May 17, 2005
| REFERENCES |
|---|
Berman, H.M., et al. (2000) The Protein Data Bank. Nucleic Acids Res., 28, 235–242.
Leontis, N.B., et al. (2002) The non-Watson–Crick base pairs and their associated isostericity matrices. Nucleic Acids Res., 30, 3497–3531.
Leontis, N.B. and Westhof, E. (2001) Geometric nomenclature and classification of RNA base pairs. RNA, 7, 499–512.
Robertson, M.P., et al. (2005) The structure of a rigorously conserved RNA element within the SARS virus genome. PLoS Biol., 3, e5.
Waugh, A., et al. (2002) RNAML: a standard syntax for exchanging RNA information. RNA, 8, 707–717.
Yang, H., et al. (2003) Tools for the automatic identification and classification of RNA base pairs. Nucleic Acids Res., 31, 3450–3460.
Zuker, M. (2003) Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res., 31, 3406–3415.