Bioinformatics Advance Access originally published online on October 22, 2007
Bioinformatics 2007 23(23):3256-3257; doi:10.1093/bioinformatics/btm516
© The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

NAPROC-13: a database for the dereplication of natural product mixtures in bioassay-guided protocols

José Luis López-Pérez 1,†, Roberto Therón 2,*,†, Esther del Olmo 1 and David Díaz 2

1Departamento de Química Farmacéutica, Facultad de Farmacia, Campus M. Unamuno, 37007 Salamanca and 2Departamento de Informática y Automática, Facultad de Ciencias, Universidad de Salamanca, Spain

*To whom correspondence should be addressed.


    ABSTRACT

Motivation: Although natural products represent a reservoir of molecular diversity, the process of isolating and identifying active compounds is a bottleneck in drug discovery programs. The rapid isolation and identification of the bioactive component(s) of natural product mixtures during bioassay-guided fractionation have become crucial factors in the competition with chemical compound libraries and combinatorial synthetic efforts. In this respect, the use of spectral databases in identification processes is indispensable.

Results: We have developed a database containing 13C NMR spectral information on more than 6000 natural compounds, which allows for the fast identification of known compounds present in crude extracts and provides insight into the structural elucidation of unknown compounds.

Availability: http://c13.usal.es

Contact: theron@usal.es


    1 INTRODUCTION
Natural products have traditionally been a major source of drugs and continue to play a significant role in today's drug discovery environments (Butler, 2004). In fact, in some therapeutic areas, such as oncology, infectious diseases and immunomodulation, many of the currently available drugs are derived from natural products (Buss and Butler, 2004).

For drug discovery and development, natural products represent a reservoir of molecular diversity that may become a complementary resource to combinatorial libraries. Nevertheless, the process of isolating and identifying active compounds is at present a bottleneck in drug discovery programs. These practical difficulties can be overcome thanks to progress in separation technologies and in the speed and sensitivity of structure elucidation (Clarkson et al., 2006). The rapid identification of the bioactive component(s) of natural product mixtures in high-throughput screening programs has become indispensable for competing effectively with chemical compound libraries and combinatorial synthetic methodologies. The isolation, identification and biological profiling of bioactive compounds will require the effective use of automated procedures and databases.

To compare and identify known structures, a database can be queried by spectral data or by substructure. 13C NMR spectroscopy is the most powerful tool for this task. Identifying previously characterized structures through a database search eliminates a tremendous amount of work, so that efforts can instead be directed towards the characterization of novel compounds. In the area of natural products, where some hundred thousand compounds have been reported in the literature, most compounds are absent from commercially available spectral libraries.

SuperNatural is a public resource containing 3D structures, developed for searches of bioactive natural compounds (Dunkel et al., 2006). NAPROC-13 could complement SuperNatural, since it allows for the rapid identification of natural products in phytochemical studies. Once a compound from a plant extract has been identified by means of NAPROC-13, similarity searches in SuperNatural could be performed in order to suggest its hypothetical biological activities.
The existing open source database on the web (NMRShiftDB; http://nmrshiftdb.ice.mpg.de) contains spectral information on natural compounds, yet, unfortunately, it also contains a significant quantity of synthetic compounds that lack drug-like properties. NAPROC-13 has the advantage of containing only natural products and a few related compounds. Unlike NMRShiftDB, NAPROC-13 uses Cartesian coordinates for graphics, which improves structure representations and allows for a better appreciation of the stereocenters in accordance with IUPAC recommendations (Fig. 1b). Given the stereospecificity of most biological targets, stereochemistry determines the biological properties of natural products and drugs. Another characteristic of NAPROC-13 is the homogeneity of the numbering system within the same family of compounds, which enables the comparison of spectral data lists of a variety of related structures. For every family of compounds, NAPROC-13 uses the numbering system of the Dictionary of Natural Products (http://www.chemnetbase.com, Chapman & Hall/CRC Press). Finally, it contains collections of compounds and their associated spectroscopic data that are not available in other databases.


Fig. 1. (a) Iterative search by 4 chemical shifts of the Bonducellpin E 13C NMR Spectrum. (b) Results obtained from the search.

 

    2 METHODS
In order to enhance the drug discovery process for natural products, we have developed NAPROC-13, a tool suited to complex chemical problems such as structure elucidation, which necessitates the joint efforts of information science and chemistry experts. NAPROC-13 significantly enhances the search for spectroscopic information on the web by integrating structure-based and numerical chemical shift searches.

The basic database scheme is relationally organized, and the molecular structures are defined and stored in the database as SMILES code (Weininger et al., 1989). This format of structural specification, which uses a one-line notation, is designed for sharing chemical structure information over the internet. Substructure searches are performed with SMARTS code, a variation of the SMILES code. The 13C NMR spectral data, in the form of a numerical list of chemical shifts and carbon multiplicities, are always associated with every compound structure. An applet, JME (Ertl and Jacob, 1997), is used to convert these notations into a graph representing a structure that can be interpreted by organic chemists.

The database contains more than 6000 natural and related compounds, mainly terpenoids (triterpenoids, diterpenoids, etc.). At present, other families, such as alkaloids, are also being added. The largest number of heavy atoms in a compound in the database is 99, and its molecular formula is C66H106O33. More than 2000 compounds contain 30 or more carbon atoms, and another 2000 compounds contain 20 or more carbon atoms. The structures and spectral data collected in the database are mainly compiled from papers published in recent issues of the following research journals: Journal of Natural Products, Phytochemistry, Planta Medica, Chemical & Pharmaceutical Bulletin, Chemistry of Natural Compounds, Helvetica Chimica Acta and Magnetic Resonance in Chemistry.
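The relational organization described here, with one structure record linked to a list of chemical shifts and multiplicities, can be pictured with a small sketch. The following Python snippet uses the standard-library sqlite3 module and is only an illustration under stated assumptions, not the actual NAPROC-13 schema: the table names, column names, the helper compounds_with_signal, the example compound and its approximate shift values are all hypothetical.

```python
import sqlite3

# Hypothetical two-table layout: one row per compound, one row per carbon signal.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE compound (
    id     INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    smiles TEXT NOT NULL          -- one-line structure notation
);
CREATE TABLE carbon_signal (
    compound_id  INTEGER NOT NULL REFERENCES compound(id),
    shift_ppm    REAL NOT NULL,   -- 13C chemical shift
    multiplicity TEXT NOT NULL    -- 's', 'd', 't' or 'q' (C, CH, CH2, CH3)
);
""")

# Illustrative data only: ethanol ('CCO') with approximate shift values.
conn.execute("INSERT INTO compound VALUES (1, 'example-ethanol', 'CCO')")
conn.executemany(
    "INSERT INTO carbon_signal VALUES (?, ?, ?)",
    [(1, 18.0, "q"), (1, 58.0, "t")],
)


def compounds_with_signal(conn, shift_ppm, multiplicity, tol=1.0):
    """Return names of compounds having a matching signal within +/- tol ppm."""
    rows = conn.execute(
        """SELECT DISTINCT c.name FROM compound c
           JOIN carbon_signal s ON s.compound_id = c.id
           WHERE s.multiplicity = ? AND ABS(s.shift_ppm - ?) <= ?
           ORDER BY c.name""",
        (multiplicity, shift_ppm, tol),
    ).fetchall()
    return [name for (name,) in rows]


# A CH2 signal near 57.5 ppm matches the example compound.
hits = compounds_with_signal(conn, 57.5, "t")
```

Keeping the signals in their own table means a numerical shift query is an ordinary join, while the SMILES column remains available for a separate structure-based search; the tolerance of 1.0 ppm is an arbitrary choice for the sketch.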
NAPROC-13 allows for flexible searches by chemical substructure and by spectral features, namely chemical shifts and multiplicities. Searches by trivial and semi-systematic names, molecular formulae, families, types and groups of compounds, according to the standard classification of natural compounds, are also provided through a pull-down list system. An important search type looks for hot-spots of the molecule, that is, for chemical shifts of connected carbons, which can be deduced from the interpretation of 2D NMR experiments such as HMQC, HMBC, ROESY, etc.
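The idea behind a hot-spot search, requiring that two queried shifts belong to carbons that are bonded to each other, can be sketched as follows. This is a minimal Python illustration, not the NAPROC-13 implementation: the record layout, the function names has_connected_pair and hot_spot_search, the compound names and all shift values are invented for the example.

```python
# Hypothetical records: carbon label -> shift (ppm), plus carbon-carbon bonds.
# All values are invented for illustration; they are not NAPROC-13 data.
COMPOUNDS = {
    "compound_A": {
        "shifts": {"C1": 38.5, "C2": 27.0, "C3": 78.9, "C4": 39.0},
        "bonds": [("C1", "C2"), ("C2", "C3"), ("C3", "C4")],
    },
    "compound_B": {
        "shifts": {"C1": 78.8, "C2": 27.2, "C3": 38.6, "C4": 39.1},
        "bonds": [("C1", "C2"), ("C2", "C3"), ("C3", "C4")],
    },
}


def has_connected_pair(record, shift_a, shift_b, tol=0.5):
    """True if two bonded carbons match shift_a and shift_b within tol ppm."""
    shifts = record["shifts"]
    for u, v in record["bonds"]:
        # A bond is undirected, so try both assignments of the two shifts.
        for x, y in ((u, v), (v, u)):
            if abs(shifts[x] - shift_a) <= tol and abs(shifts[y] - shift_b) <= tol:
                return True
    return False


def hot_spot_search(compounds, shift_a, shift_b, tol=0.5):
    """Names of compounds containing a bonded carbon pair with the given shifts."""
    return sorted(
        name
        for name, record in compounds.items()
        if has_connected_pair(record, shift_a, shift_b, tol)
    )


# Both records contain signals near 78.9 and 39.0 ppm, but only in
# compound_A are the two carbons directly bonded.
result = hot_spot_search(COMPOUNDS, 78.9, 39.0)
```

In this toy data set a plain shift search for 78.9 and 39.0 ppm would return both compounds, whereas the connectivity constraint keeps only one, which is the kind of extra selectivity that correlations from 2D experiments can provide.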


    3 CASE STUDY
A phytochemical study of Caesalpinia bonduc (L.) Roxb was described in the September 2007 issue of Journal of Natural Products (Pudhom et al., 2007). This species is used as a medicinal plant in various regions of the tropics. In fact, metabolites of the same family isolated from this plant present antiviral, antimalarial, antibacterial and antioxidant activities. In that article, the isolation of 3 new natural products together with 13 known diterpenoids is described. 13C NMR data for 11 of the 13 known compounds are available in NAPROC-13 (only α-caesalpin and ξ-caesalpin are absent). All of them could be easily and rapidly identified with the NAPROC-13 database as members of a group of diterpenoids called vouacapanes. A subsequent search by type of compound, vouacapanes, allowed us to find spectroscopic data for 78 compounds and to visualize their chemical shifts graphically over the structure. Had the authors used the iterative search by chemical shifts available in NAPROC-13, they could have rapidly elucidated the structures of the three new compounds. Indeed, as can be seen in Figure 1a, a search using only 4 chemical shifts of the 23 signals in the spectrum of compound 1, Bonducellpin E, allowed us to find 6 compounds (Fig. 1b), all of which belong to the vouacapane group. The 11 previously described compounds belong to the same group. These findings indicate that searches carried out with NAPROC-13 are highly efficient and selective. In addition, the results derived from the analysis of the HMBC correlations published in the same paper could be studied with the group searches also implemented in NAPROC-13.
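The narrowing effect of an iterative search, in which each additional chemical shift reduces the set of candidate compounds, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the search algorithm used by NAPROC-13: the mini-library, the function names matches and iterative_search, the tolerance and every shift value are hypothetical and are not data for Bonducellpin E.

```python
# Hypothetical mini-library: compound name -> list of (shift_ppm, multiplicity).
# Names and values are invented; they are not Bonducellpin E data.
LIBRARY = {
    "compound_A": [(170.2, "s"), (143.1, "d"), (75.4, "d"), (21.3, "q")],
    "compound_B": [(170.0, "s"), (143.3, "d"), (68.0, "d"), (21.1, "q")],
    "compound_C": [(170.4, "s"), (122.5, "d"), (75.6, "d"), (15.9, "q")],
}


def matches(spectrum, shift, mult, tol):
    """True if the spectrum has a signal of this multiplicity within tol ppm."""
    return any(m == mult and abs(s - shift) <= tol for s, m in spectrum)


def iterative_search(library, query, tol=0.5):
    """Apply query signals one at a time; return the candidates after each step."""
    candidates = sorted(library)
    history = []
    for shift, mult in query:
        candidates = [
            name for name in candidates
            if matches(library[name], shift, mult, tol)
        ]
        history.append(candidates)
    return history


# Three query signals progressively narrow three candidates down to one.
steps = iterative_search(LIBRARY, [(170.1, "s"), (143.2, "d"), (75.5, "d")])
```

For simplicity the sketch checks each query signal independently and does not enforce a one-to-one assignment between query signals and library signals; a stricter search would add that constraint.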


    ACKNOWLEDGEMENTS
Financial support came from the Ministerio de Educación y Ciencia, project TIN2006-06313, and the Junta de Castilla y León, projects SAO30A06 and US21/06. The authors wish to thank Dr Peter Ertl for kindly consenting to the non-profit use of JME.

Conflict of Interest: none declared.


    FOOTNOTES
 
Associate Editor: Jonathan Wren

†The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.

Received on August 5, 2007; revised on October 2, 2007; accepted on October 8, 2007

    REFERENCES

    Buss DB, Butler MS. A new model for utilising chemical diversity from natural sources. Drug Dev. Res (2004) 62:362–370.

    Butler MS. The role of natural product chemistry in drug discovery. J. Nat. Prod (2004) 67:2141–2153.

    Clarkson C, et al. Discovering new natural products directly from crude extracts by HPLC-SPE-NMR: chinane diterpenes in Harpagophytum procumbens. J. Nat. Prod (2006) 69:527–530.

    Dunkel M, et al. SuperNatural: a searchable database of available natural compounds. Nucleic Acids Res (2006) 34:D678–683.

    Ertl P, Jacob O. WWW-based chemical information system. Theochem (1997) 419:113–120.

    Pudhom K, et al. Cassane Furanoditerpenoids from the Seed Kernels of Caesalpinia bonduc from Thailand. J. Nat. Prod (2007) 70:1542–1544.

    Weininger D, et al. Smiles. 2. Algorithm for generation of unique SMILES. J. Chem. Inf. Comput. Sci (1989) 29:97–101.

