Bioinformatics Advance Access originally published online on June 9, 2008
Bioinformatics 2008 24(15):1731-1732; doi:10.1093/bioinformatics/btn259

© The Author 2008. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

DNAlive: a tool for the physical analysis of DNA at the genomic scale

J. Ramon Goñi 1,2, Carlos Fenollosa 1,2,3, Alberto Pérez 1,2,3,4, David Torrents 1,2,5 and Modesto Orozco 1,2,3,4,*

1Joint IRB-BSC Program on Computational Biology, Institute of Research in Biomedicine, Parc Científic de Barcelona, Josep Samitier 1-5, Barcelona 08028, 2Barcelona Supercomputing Center, Jordi Girona 31, Barcelona 08034, 3National Institute of Bioinformatics, Parc Científic de Barcelona, Josep Samitier 1-5, 4Departament de Bioquímica, Facultat de Biología, Avgda Diagonal 647, Barcelona 08028 and 5Institut Català per la Recerca i Estudis Avançats (ICREA), Barcelona, Spain

*To whom correspondence should be addressed.


    ABSTRACT

Summary: DNAlive is a tool for the analysis and graphical display of structural and physical characteristics of genomic DNA. The web server implements a wide repertoire of metrics to derive physical information from DNA sequences with a powerful interface to derive 3D information on large sequences of both naked and protein-bound DNAs. Furthermore, it implements a mesoscopic Metropolis code which allows the inexpensive study of the dynamic properties of chromatin fibers. In addition, our server also surveys other protein and genomic databases allowing the user to combine and explore the physical properties of selected DNA in the context of functional features annotated on those regions.

Availability: http://mmb.pcb.ub.es/DNAlive/ ; http://www.inab.org/

Contact: modesto{at}mmb.pcb.ub.es

Supplementary information: Supplementary data are available at Bioinformatics online.


    1 INTRODUCTION
Large-scale genome projects have revealed the sequences of nearly 50 eukaryotic genomes, including those of several mammals (among them, humans), and many more will become available in the coming years. So far, the annotation of these genomes has been largely restricted to the identification and one-dimensional location of functional features (mostly genes and their regulatory regions), without considering the structural parameters of their environment, which have been shown to be crucial for the functionality of DNA. Determining the structural properties of DNA and combining them with functional features is necessary to interpret and understand the functionality of genomes in a more complex, and therefore more realistic, environment. Identifying these structural parameters allows scientists to consider the different levels of accessibility of given DNA regions to different proteins, such as transcription factors, polymerases and DNA methylases. For example, specific deformability or helical properties in a given region of DNA can facilitate or impair the formation of nucleosomes hundreds of base pairs away, or can affect the dimerization of two DNA-binding proteins that might be separated by thousands of bases in sequence. Different groups (Abeel et al., 2008; Goñi et al., 2007; Ohler et al., 2001; Pedersen et al., 2000; Singhal et al., 2008) have demonstrated that regulatory regions in DNA display unusual physical properties, and two groups have recently shown independently (Abeel et al., 2008; Goñi et al., 2007) that eukaryotic promoters can be located with surprisingly good accuracy just by analyzing simple physical descriptors of DNA, which confirms the existence of a hidden physical code that controls gene function. In summary, functional annotation needs to be complemented with physical data to understand the structure, dynamics and general functionality of genomic DNA.

DNAlive has been developed to give a complete description of the physical properties of genomic DNA in a simple way, thus providing data that can be easily understood by non-experts in structural biology. Among other things, DNAlive allows the user to (i) determine potential correlations between genome annotations (such as transcription start sites, exons, splicing sites, ...) and a battery of 29 physical descriptors of DNA (stability, helical descriptors, curvature, non-canonical B-DNA affinity, stiffness, ...); (ii) find the most stable 3D structure of long genome fragments (both naked DNA and DNA-protein complexes) using sequence-dependent average helical parameters and, when available, experimental structural data on DNA-protein complexes; (iii) perform a dynamic analysis of the chromatin fiber, exploring the range of deformability sampled during the trajectory and the possibility of formation of transient protein–protein complexes; and (iv) display structural parameters of DNA in the context of associated functional features obtained from several public databases. The tool is available as a web page and also as a set of web services, which can be incorporated into user workflows (Supplementary Material).


    2 IMPLEMENTATION
2.1 Entry data
The only mandatory input for DNAlive is a DNA sequence in FASTA format or the genomic coordinates of a region in a supported vertebrate genome. The program retrieves parameters from its internal databases (Supplementary Table 1) to determine physical profiles and to create a 3D structure of the naked DNA. Given a DNA sequence, the program identifies potential transcription factor binding sites (TFBS) by scanning the public TRANSFAC database (http://www.gene-regulation.com/), which is linked to the PDB (http://www.rcsb.org/) and UniProt (http://www.ebi.uniprot.org/) databases. The selection of the complex of interest can be controlled by the user, who can force the generation of specific complexes (for example, nucleosomes, protein multicomplexes, etc.).
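
To illustrate what scanning a sequence for potential binding sites can look like, the following is a minimal sketch that scores every window of a sequence against a position weight matrix. It is not the server's code: the function names, the pseudocount, the uniform background and the toy count matrix are all assumptions made for illustration, and only the forward strand is scanned.

```python
import math
from typing import Dict, List, Tuple

BASES = "ACGT"


def log_odds_matrix(counts: List[Dict[str, int]],
                    pseudocount: float = 1.0,
                    background: float = 0.25) -> List[Dict[str, float]]:
    """Turn per-position base counts into log2-odds scores.

    The pseudocount and the uniform background are illustrative assumptions.
    """
    matrix = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + 4 * pseudocount
        matrix.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix


def scan(seq: str, matrix: List[Dict[str, float]],
         threshold: float) -> List[Tuple[int, float]]:
    """Return (start, score) for forward-strand windows scoring >= threshold."""
    seq = seq.upper()
    width = len(matrix)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows with ambiguous bases such as N
        score = sum(column[base] for column, base in zip(matrix, window))
        if score >= threshold:
            hits.append((start, score))
    return hits


# Toy count matrix for a hypothetical TATA-like motif (placeholder values).
toy_counts = [{"T": 10}, {"A": 10}, {"T": 10}, {"A": 10}]
toy_matrix = log_odds_matrix(toy_counts)
hits = scan("GGTATAGG", toy_matrix, threshold=5.0)
```

A production scan would also score the reverse complement and use matrix-specific thresholds; both are omitted here to keep the sketch short.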

2.2 Server workflow
Once a DNA sequence is entered (Fig. 1), the program computes the profile for the 29 physical properties available for the fiber (Supplementary Table 1). All properties are represented in a 2D plot using either the UCSC Genome Browser (http://genome.ucsc.edu) in combination with annotated genes whenever genomic coordinates for the genome are provided, or Gnuplot (Fig. 1 and Supplementary Fig. 1).
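
As a rough illustration of how a sequence-dependent physical profile can be computed, the sketch below looks up one descriptor value per dinucleotide step and then smooths the result with a sliding window. The lookup table holds placeholder numbers, not the parameters of Supplementary Table 1, and the function names and the windowing scheme are assumptions rather than a description of the server's implementation.

```python
from typing import Dict, List, Optional

BASES = "ACGT"

# Placeholder values for one hypothetical descriptor, one per dinucleotide
# step. Real parameters would come from a table such as Supplementary Table 1.
TOY_TABLE: Dict[str, float] = {
    a + b: float(4 * i + j)
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
}


def step_profile(seq: str, table: Dict[str, float]) -> List[Optional[float]]:
    """Descriptor value at each dinucleotide step (None if the step is unknown)."""
    seq = seq.upper()
    return [table.get(seq[i:i + 2]) for i in range(len(seq) - 1)]


def smooth(values: List[Optional[float]], window: int) -> List[Optional[float]]:
    """Sliding-window mean that ignores missing values inside each window."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for start in range(len(values) - window + 1):
        chunk = [v for v in values[start:start + window] if v is not None]
        smoothed.append(sum(chunk) / len(chunk) if chunk else None)
    return smoothed


profile = step_profile("ACGT", TOY_TABLE)
averaged = smooth(profile, window=2)
```

The smoothed list can be written out as a two-column text file and plotted with a tool such as Gnuplot, or exported as a track for a genome browser when genomic coordinates are known.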


Fig. 1. DNAlive web server workflow diagram.

 
To combine the visualization of DNA physical properties with public annotations of the genome, the coordinates of the input DNA sequence can be obtained by running a search on our local Blat server (Kent, 2002). Although the user can manually place transcription factor PDB structures at specific positions of the input DNA sequence, we have also implemented an automatic method to perform this step using the TFBS Perl library (Lenhard and Wasserman, 2002). The average 3D structure of DNA is reconstructed using sequence-dependent base step parameters derived from accurate atomistic molecular dynamics (Pérez et al., 2007) and a locally adapted X3DNA (Lu and Olson, 2003) script (Fig. 1 and Supplementary Fig. 2). When structural information on protein–DNA complexes is available, the modeled structures in the corresponding segment are replaced by the experimental geometries, and junctions are refined if required. 3D structures are visualized by integrating Jmol Java applets (http://www.jmol.org/) in the HTML page. All physical descriptors can be mapped onto the 3D structure to help detect potential correlations between conformation, functional annotations and physico-chemical properties (Fig. 1).
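
The idea of rebuilding a fiber path from base step parameters can be sketched as follows. This is a deliberately simplified scheme, not the X3DNA rebuilding procedure: each step first translates the base pair origin by (shift, slide, rise) in the current local frame and then rotates the frame by tilt, roll and twist about the local x, y and z axes, whereas 3DNA-style rebuilding works with a mid-step reference frame. All function names are hypothetical; angles are assumed to be in degrees and distances in Å.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
Vector = List[float]
Step = Tuple[float, float, float, float, float, float]


def rotation(axis: str, degrees: float) -> Matrix:
    """Right-handed rotation matrix about the x, y or z axis."""
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    if axis == "x":
        return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]
    if axis == "y":
        return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
    if axis == "z":
        return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    raise ValueError("axis must be 'x', 'y' or 'z'")


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(m[r][k] * v[k] for k in range(3)) for r in range(3)]


def build_origins(steps: Sequence[Step]) -> List[Vector]:
    """Base pair origins for steps given as (shift, slide, rise, tilt, roll, twist)."""
    frame: Matrix = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    origin: Vector = [0.0, 0.0, 0.0]
    origins = [origin]
    for shift, slide, rise, tilt, roll, twist in steps:
        # Translate along the current local axes (simplification, see lead-in).
        displacement = matvec(frame, [shift, slide, rise])
        origin = [o + d for o, d in zip(origin, displacement)]
        # Rotate the local frame: tilt about x, roll about y, twist about z.
        local = matmul(rotation("z", twist),
                       matmul(rotation("y", roll), rotation("x", tilt)))
        frame = matmul(frame, local)
        origins.append(origin)
    return origins


# Ten identical steps with illustrative values: a straight helix axis.
straight = build_origins([(0.0, 0.0, 3.4, 0.0, 0.0, 36.0)] * 10)
# A single 90-degree roll bends the path from the z axis onto the x axis.
bent = build_origins([(0.0, 0.0, 1.0, 0.0, 90.0, 0.0),
                      (0.0, 0.0, 1.0, 0.0, 0.0, 0.0)])
```

In a sequence-dependent model, each step would take its six values from a table indexed by dinucleotide type instead of the constant tuples used here.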

The server also includes unique tools for a rapid representation of chromatin dynamics. In extensive tests performed in our laboratory on our database of more than 100 trajectories, these tools reproduced the essential deformation pattern of DNA with surprisingly high accuracy. The method uses a mesoscopic Metropolis Monte Carlo algorithm, in which the geometry of each base pair is defined by three local rotations (roll, tilt and twist) and three local translations (slide, shift and rise), and the conformational energy is estimated from the deformation matrix using a harmonic model (Equation 1), where the index 'i' stands for one of the M base pair steps and the index 'j' stands for one of the six unique helical parameters (ξ) of each step. The equilibrium value of each helical parameter for a given base pair step type (ξ⁰) and the associated deformation constant (Ki,j) were previously determined from molecular dynamics simulations (Pérez et al., 2007). Once a movement in helical coordinates is accepted by the Metropolis test, the corresponding Cartesian structure of the fiber is generated using an adaptation of X3DNA (Lu and Olson, 2003) and displayed as a video using Jmol Java applets in the HTML page (Supplementary Fig. 3). The Jmol interface supports basic manipulation and analysis of the trajectories and structures (rotations, translations, distance measurements, ...), which allows the identification of potential DNA-mediated protein clusters.


$$E = \sum_{i=1}^{M} \sum_{j=1}^{6} \frac{K_{i,j}}{2} \left( \xi_{i,j} - \xi_{i,j}^{0} \right)^{2} \qquad (1)$$
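
The sampling scheme described above can be sketched as a textbook Metropolis Monte Carlo loop over helical coordinates with a harmonic energy. This is an illustration under stated assumptions, not the server's code: `x`, `x0` and `k` stand in for the helical parameters, their equilibrium values and the deformation constants; the single-coordinate uniform move, the value of `kT` (about 0.593, which corresponds to kcal/mol at roughly 298 K) and the toy parameter values are all assumptions.

```python
import math
import random
from typing import List, Tuple

Table = List[List[float]]  # indexed as [step i][helical parameter j]


def harmonic_energy(x: Table, x0: Table, k: Table) -> float:
    """Sum over steps i and parameters j of (k/2) * (x - x0)**2."""
    return sum(0.5 * k[i][j] * (x[i][j] - x0[i][j]) ** 2
               for i in range(len(x)) for j in range(len(x[i])))


def metropolis(x0: Table, k: Table, n_moves: int, kT: float = 0.593,
               max_move: float = 0.5, seed: int = 0
               ) -> Tuple[Table, List[float], int]:
    """Sample helical coordinates; returns final state, energies, acceptances."""
    rng = random.Random(seed)
    x = [row[:] for row in x0]   # start from the equilibrium geometry
    energy = 0.0                 # the harmonic energy is zero at equilibrium
    energies = []
    accepted = 0
    for _ in range(n_moves):
        i = rng.randrange(len(x))
        j = rng.randrange(len(x[i]))
        old = x[i][j]
        new = old + rng.uniform(-max_move, max_move)
        # Only one term of the double sum changes, so the energy difference
        # can be computed locally.
        delta = 0.5 * k[i][j] * ((new - x0[i][j]) ** 2 - (old - x0[i][j]) ** 2)
        # Metropolis test: accept downhill moves, uphill with Boltzmann weight.
        if delta <= 0.0 or rng.random() < math.exp(-delta / kT):
            x[i][j] = new
            energy += delta
            accepted += 1
        energies.append(energy)
    return x, energies, accepted


# Toy system: 5 steps x 6 helical parameters with placeholder values.
toy_x0 = [[0.0, 0.0, 3.4, 0.0, 0.0, 36.0] for _ in range(5)]
toy_k = [[1.0] * 6 for _ in range(5)]
final_x, energies, accepted = metropolis(toy_x0, toy_k, n_moves=1000)
```

Each accepted state would then be converted to Cartesian coordinates to produce one frame of the trajectory. A fuller model would mix angular and translational coordinates with consistent units and could include coupling terms between parameters, which this sketch leaves out.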


    ACKNOWLEDGEMENTS
We thank Agnes Noy, David Piedra, Henrique Proenc and Joaquín Panadero for their help as β-testers of the server.

Funding: This work has been supported by the Spanish Ministry of Education and Science (BIO2006-01602 and BIO2006-15036), the Spanish Ministry of Health (COMBIOMED network), the Fundación Marcelino Botín and the National Institute of Bioinformatics (Structural Bioinformatics Node).

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Alfonso Valencia

Received on March 27, 2008; revised on May 16, 2008; accepted on June 4, 2008

    REFERENCES

    Abeel T, et al. Generic eukaryotic core promoter prediction using structural features of DNA. Genome Res (2008) 18:310–323.

    Goñi JR, et al. Determining promoter location based on DNA structure first-principles calculations. Genome Biol (2007) 8:R263.

    Kent WJ. BLAT - the BLAST-like alignment tool. Genome Res (2002) 12:656–664.

    Lenhard B, Wasserman WW. TFBS: computational framework for transcription factor binding site analysis. Bioinformatics (2002) 18:1135–1136.

    Lu XJ, Olson WK. 3DNA: a software package for the analysis, rebuilding and visualization of three-dimensional nucleic acid structures. Nucleic Acids Res (2003) 31:5108–5121.

    Ohler U, et al. Joint modeling of DNA sequence and physical properties to improve eukaryotic promoter recognition. Bioinformatics (2001) 17(Suppl. 1):S199–S206.

    Pedersen AG, et al. A DNA structural atlas for Escherichia coli. J Mol Biol (2000) 299:907–930.

    Pérez A, et al. Refinement of the AMBER force field for nucleic acids. Improving the description of α/γ conformers. Biophys J (2007) 92:3817–3829.

    Singhal P, et al. Prokaryotic gene finding based on physicochemical characteristics of codons calculated from molecular dynamics simulations. Biophys J (2008) [EPub ahead of print; DOI:10.1529/biophysj.107.116392].

