Bioinformatics Advance Access originally published online on November 5, 2004
Bioinformatics 2005 21(7):1246-1256; doi:10.1093/bioinformatics/bti137

© The Author 2004. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions{at}oupjournals.org

ISYMOD: a knowledge warehouse for the identification, assembly and analysis of bacterial integrated systems

Julie Chabalier 1,2, Cécile Capponi 1,2, Yves Quentin 2 and Gwennaele Fichant 3

1Laboratoire d'Informatique Fondamentale de Marseille 39 rue Joliot Curie 13453 Marseille cedex 13, France
2Laboratoire de Chimie Bactérienne, CNRS 31 Chemin Joseph Aiguier, 13402 Marseille cedex 20, France
3Laboratoire de Microbiologie et Génétique Moléculaires, UMR 5100 CNRS-Université Paul Sabatier Toulouse Cedex, France

*To whom correspondence should be addressed.


    Abstract
 TOP
 Abstract
 1 INTRODUCTION
 2 BIOLOGICAL INTEGRATED SYSTEMS
 3 KNOWLEDGE REPRESENTATION IN...
 4 ISYMOD SCHEME
 5 OPERATIONAL ISYMOD
 6 DISCUSSION
 REFERENCES
 

Motivation: Complex biological functions emerge from interactions between proteins in stable supra-molecular assemblies and/or through transitory contacts. Most of the time, the protein partners of these assemblies are composed of one or several domains that exhibit different biochemical functions. Thus, the study of cellular processes requires the identification of the different functional units and their integration into an interaction network; such complexes are referred to as integrated systems. To exploit the growing volume of released data efficiently, automated bioinformatics strategies are needed to identify, reconstruct and model such systems. For that purpose, we have developed a knowledge warehouse dedicated to the representation and acquisition of bacterial integrated systems involved in the exchanges between the bacterial cell and its environment.

Results: ISYMOD is a knowledge warehouse that consistently integrates, in the same environment, the data and the methods used for their acquisition. This is achieved through the construction of (1) a domain knowledge base (DKB) devoted to the storage of the knowledge about the systems, their functional specificities, their partners and how they are related and (2) a methodological knowledge base (MKB) which depicts the task layout used to identify and reconstruct functional integrated systems. Instantiation of the DKB is obtained by solving the tasks of the MKB, whereas some tasks need instances of the DKB to be solved. AROM, an object-based knowledge representation system, has been used to design the DKB, and its task manager, AROMTASKS, to develop the MKB. In this study, two integrated systems, ABC transporters and two-component systems, both involved in the adaptation of a bacterial cell to its biotope, have been used to evaluate the feasibility of the approach.

Contact: julie.chabalier{at}ibsm.cnrs-mrs.fr


    1 INTRODUCTION
With the increasing amount of available genomic data it has become clear that computational approaches are needed to supervise the methods used to exploit them.

Different approaches have been developed to address the various problems of safely storing, sharing and exploiting biological data. Ontologies, such as Gene Ontology (Gene Ontology Consortium, 2004) or MetaCyc (Karp et al., 2004; Krieger et al., 2004), have been developed to set up common terminologies, hence facilitating the sharing and the mapping of biological annotations. Some ontologies are the backbone of knowledge bases and databases. In addition, numerous knowledge bases and data warehouses have been developed, each usually focusing on a specific biological domain, such as RiboWeb (Altman et al., 1999), dedicated to the modeling of the supra-molecular organization of the ribosome, or GemCore (Bronner et al., 2002), devoted to knowledge representation in comparative genomics. Meanwhile, some recent projects aim at structuring and organizing bioinformatics resources (data and programs) distributed over the web (Stevens et al., 2003). Finally, many approaches rely on data warehouses: the data are imported locally from a remote server, then local methods are applied to predict new data from the available data (Perrière et al., 2000).

These various approaches usually require the modeling, and the parsable, consistent representation, of the numerous underlying objects and the way they interact (biological concepts, bioinformatics methods, problems to be solved, etc.). Once the consistency of the modeling is ensured, automatic computational methods can safely manipulate the described data. Usually, such consistency is ensured by a knowledge representation system (KRS) built over a knowledge representation language. However, in the case of data warehouses, consistency cannot always be ensured automatically: as programs are executed daily to produce new data, it becomes increasingly difficult to guarantee the correctness of the whole database, because the host representation system is usually not able to control all the stages of the computational processes.

We present here the development of an operational environment, named Integrated SYstem MODeling (ISYMOD), which is a type of data warehouse whose originality is to be built over a KRS that ensures the consistency of computed and represented data: one could name it a knowledge warehouse (Nemati et al., 2002). It is built over AROM, a KRS which features two connected representation languages: one for representing the knowledge of the domain, and the other for representing the strategies used to infer and store new information from available data. Hence, data and analysis methods are merged within the same local environment, which facilitates their interaction: the former are produced by the latter, while the latter are tuned by the former. AROM controls the consistency of the whole. More precisely, ISYMOD is a knowledge warehouse (KW) which integrates, in the same environment, a domain knowledge base (DKB) and a methodological knowledge base (MKB). Both knowledge bases are extensible, which means that new concepts and new methods can be added at any moment, either to complete the knowledge about integrated systems, or to tune new strategies and algorithms for identifying them. The current version of ISYMOD is devoted to the representation and acquisition of bacterial integrated systems that are involved in the exchanges between the bacterial cell and its environment. These systems are particularly interesting since they are a driving force in evolution, allowing the adaptation of bacteria to a wide spectrum of biotopes, some of them having environmental, industrial or public health implications. The first systems to be instantiated were ABC transporters and two-component systems (TCS). They can share functional links, as demonstrated by experimental studies (Joseph et al., 2002 and references therein).

The paper is organized as follows. First, the main characteristics of biological integrated systems are introduced, together with a strategy to target them in a new genome. Then the KRS AROM is presented, which features an object-based kernel and a task manager. We then describe ISYMOD, which models some integrated systems as well as the strategies to identify and reconstruct them, and ultimately stores experimental and predicted data. Finally, an operational view of ISYMOD is briefly given, before a discussion of this work in relation to others.


    2 BIOLOGICAL INTEGRATED SYSTEMS
2.1 Definitions
Complex biological functions emerge from interactions between proteins in stable supra-molecular assemblies and/or through transitory contacts. Such emerging functions are most of the time different from the functions of the individual proteins involved in the interactions. Thus, if we want to get a systemic view of the organism under study and understand the evolution of biological processes, the question of predicting individual proteins and assembling them into such complexes, referred to as integrated systems, should be addressed.

The integrated systems that are involved in the exchanges between the bacterial cell and its environment are particularly interesting, both for biological reasons and for illustrating the computational challenges we face. Such systems are important for the adaptation of bacteria to their media, and the comparative genomic analysis of their repertoires should help to understand the molecular mechanisms that are involved in the adaptation processes of bacterial genomes. Two types of integrated systems have been analyzed: ABC transporters and TCS.

The ABC transporters, or traffic ATPases, are found in the three major life kingdoms (Prokaryota, Archaea and Eukaryota) and are involved in many physiological processes. Most ABC transporters mediate the active uptake or efflux of specific molecules across biological membranes, handling a wide variety of compounds that differ in nature and size (oligosaccharides, amino acids, peptides, antibiotics, metallic cations, etc.) (reviewed by Holland and Blight, 1999; Higgins, 2001). They are encoded by large families of paralogous genes and can be arranged in a comprehensive classification that is well correlated with substrate specificity (Linton and Higgins, 1998; Paulsen et al., 1998; Taglicht and Michaelis, 1998; Tomii and Kanehisa, 1998; Quentin et al., 1999; Saurin et al., 1999; Braibant et al., 2000; Dassa and Bouiges, 2001). A typical ABC transporter (either exporter or importer) consists of two membrane-spanning domains (MSDs) and two nucleotide-binding domains (NBDs) (Fig. 1). The import systems are associated with a solute-binding protein (SBP). In bacteria, different genes generally encode the different domains, and in newly sequenced genomes only genes encoding NBDs are correctly annotated. Indeed, among these domains, only the NBDs exhibit strong sequence conservation. The MSDs and SBPs, in contrast, show only fuzzy global sequence conservation.



Fig. 1 Typical ABC import system. Five proteins interact to import the substrate into the cell. The MSDs constitute the membrane channel and the NBDs energize the transport through ATP hydrolysis. The SBP confers specificity for compounds on the transporter. In the case of the export system, the SBP is absent.

 
The TCS are regulatory systems involved in the detection and the transduction of specific signals that trigger an adaptive process in the bacterium (Parkinson and Kofoid, 1992). These systems are usually composed of a sensor kinase that is able to detect one or several environmental stimuli and which phosphorylates a response regulator, which in turn activates the expression of the genes necessary for the appropriate physiological response. Each of these two partners contains two domains: (1) an input domain and a transmitter domain for the sensor kinase and (2) a receiver domain and an output domain for the response regulator. The sensor input domain varies in amino acid sequence, conferring specificity for different stimuli. The output domain of the response regulator can be classified into different subfamilies (Mizuno, 1997; Fabret et al., 1999; Beier and Frank, 2000; Throup et al., 2000; Rodrigue et al., 2000). However, more complex structures can be found, and some systems use extra isolated phospho-relay domains (HPT domains) that are encoded by separate genes.

Both systems share common characteristics: (1) they are composed of different domains encoded, in bacteria, by separate genes, some of the partner proteins having only fuzzy sequence conservation, (2) from the computational point of view, they are encoded by two families of paralogous genes which are among the largest in bacterial genomes, and they raise the same identification and reconstruction problems and (3) many experimental studies have shown that they are related at the functional level, allowing an extension of the modeling to a higher level (Joseph et al., 2002 and references therein). Indeed, upon stimulus detection, the response regulator is activated through a cascade of phosphorylations and in turn activates the expression of the genes encoding the different partners of the ABC transporter.

2.2 Identification, reconstruction and storage
In order to establish the repertoire of a given integrated system in a complete genome, we have to go further than the first level of genome annotation (gene and functional predictions). This higher level of annotation includes the following steps: (1) identifying the different partners using different bioinformatics methods according to their sequence properties (sequence conservation levels and structural characteristics), (2) reconstructing the systems using assembly rules and (3) classifying the system into the correct functional subfamily. Information on the interaction pathway is not directly accessible from the analysis of the complete sequence. However, this knowledge can be inferred either from the analysis of the genomic localization of the genes encoding such systems or through phylogenetic inferences drawn from multiple genomic comparisons. Indeed, functionally related genes are often found in the same chromosomal neighborhood, and when they are dispersed along the chromosome, homology relationships may help to reassemble partners (Quentin et al., 2002). Therefore, complex computational approaches are needed to handle the analysis and modeling of integrated systems, and the analysis methods have to be combined in specific ways referred to as bioinformatics strategies.

As a first step in this direction, we have implemented a general automated strategy (Quentin et al., 2002). It was validated first on the ABC transporters and then extended to the two-component systems. It relies upon a learning step for computing the parameters of the methods involved in the identification process. These parameters are of two types, motifs and profiles, since the bioinformatics methods used are based either on similarity searches (Blast, PsiBlast, Hmmer) or on motif identification (MetaMeme, Mast). They were first derived from a set of ABC transporters and two-component systems that we had annotated by analyzing the set of proteins encoded by 20 bacterial genomes. To analyze a new incoming genome, the different bioinformatics methods are launched simultaneously on the encoded protein sequences and their results are combined. The next step is the validation of the predictions. The methods are used with a high sensitivity level in order to maximize the number of true positives; the drawback is a high number of false positives. To reduce this number, we apply the BlastP2 program as follows: each predicted protein is used as a query against a dataset composed of all protein sequences encoded by previously processed genomes. Queries for which the first hits do not belong to the integrated system partner sub-families are considered false positives. We refer to this checking procedure as BackBlast. The remaining proteins are considered validated system partners and are classified into functional sub-families, divided into functional domains and assembled into a biological integrated system using two rules: (1) the genes encoding the different system partners are closely located on the chromosome and (2) the partners belong to compatible sub-families (Quentin et al., 1999). The output data have been stored in a specialized database, ABCdb (Quentin and Fichant, 2000).
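The validation and assembly steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the published implementation: the record layout, the function names, the sub-family labels and the distance threshold `max_gap` are all hypothetical, since the text does not say how "closely located" is quantified or how sub-family compatibility is encoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A validated partner protein (hypothetical record layout)."""
    name: str
    gene_start: int   # start of the encoding gene on the chromosome
    subfamily: str    # functional sub-family label


def back_blast_filter(first_hits, partner_subfamilies):
    """Return the queries whose first hit belongs to a partner sub-family.

    first_hits maps each query name to the sub-family label of its first
    hit against previously processed genomes (None when there is no hit).
    Every other query is treated as a false positive.
    """
    return {q for q, fam in first_hits.items() if fam in partner_subfamilies}


def assemble(candidates, max_gap, compatible):
    """Group partners with the two rules quoted in the text.

    Rule 1 (assumed form): consecutive genes start at most max_gap apart.
    Rule 2 (assumed form): the pair of sub-families is listed in `compatible`,
    a set of frozensets.
    """
    assemblies = []
    for cand in sorted(candidates, key=lambda c: c.gene_start):
        last = assemblies[-1][-1] if assemblies else None
        if (last is not None
                and cand.gene_start - last.gene_start <= max_gap
                and frozenset((last.subfamily, cand.subfamily)) in compatible):
            assemblies[-1].append(cand)
        else:
            assemblies.append([cand])
    return assemblies
```

A greedy left-to-right grouping is the simplest reading of the two rules; a real pipeline would also have to handle partners dispersed along the chromosome, which the text says are reassembled using homology relationships.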

The strategies developed use rules and parameters that are updated with the incoming data. The analysis mechanism stores the data obtained, then reuses them to reevaluate the methods' parameters and thereby launches more accurate analyses of new data sets. Thus, predictions made in initial states can be updated in subsequent runs, with the risk of incorrectly propagating the modifications through the complex network of dependencies. This pitfall can be avoided if the dataflow between the database and the analysis mechanism is controlled throughout the strategies. This control can be fully achieved only if the data and the bioinformatics methods involved in the strategies are formally defined. In addition, although the individual proteins involved in an integrated system can easily be stored in a database, representing the complex relationships among them requires more sophisticated data-processing and semantic tools.

AROM, an object-based knowledge representation system (OBKRS), fits these objectives in several ways: (1) it allows a formal and explicit representation of the integrated system's objects and their relationships as classes and n-ary instantiable associations, (2) its internal classification mechanism, supplemented with a propagation algorithm, ensures automatic management of the evolving knowledge through the recursive process of identifying partners and reconstructing assemblies, (3) the current integration of a task manager, AROMTASKS, provides a declarative way to represent and store bioinformatics strategies and (4) the formal modeling of biological knowledge and sequence analysis tools in the same environment ensures an automatic control of the dataflow.

The modeling and storage of the knowledge is achieved through the development of a DKB and the modeling of the methods through the development of an MKB, both being tightly connected.


    3 KNOWLEDGE REPRESENTATION IN AROM
3.1 AROM's kernel and classification
In AROM as well as in other OBKRSs, a concept denotes a set of objects with common properties. Therefore, biological entities are modeled by classes, while their properties are named and specified as typed variables.

Let us take ABC transporters as an illustration of an integrated system (Fig. 1). Three biological entities emerge: domain, protein and assembly. The domain (NBD, MSD or SBP) carries the biochemical function and is a part of a protein sequence. The protein is the physical entity composed of one or many domains. The assembly corresponds to the reconstructed biological system. Its structure results from protein interactions and its function arises from the combination of domains. These three biological entities can be represented by three classes (Fig. 2a). Each class is described by a set of variables corresponding to object features such as the number of proteins involved in an assembly (variable partnersNb of the class Assembly), the organization of the domains found in a protein (variable structuralOrganization of the class Protein), or the domain's type (variable type of the class Domain) (Fig. 2a).



Fig. 2 (a) UML-like graphical representation of integrated systems. The three principal biological entities involved in the integrated systems are represented by three root classes depicted by dark gray rectangles: Assembly, Domain and Protein. These classes, described by typed variables, are linked by the ternary association HasDomain (round corner rectangle), itself described by typed variables. Roles and multiplicities of the association are specified on the black lines. The Assembly class is specialized into the ABC_Assembly sub-class (white rectangle) that represents the ABC transporters. The specialization relation is depicted by an arrow from the sub-class toward the root class. (b) Illustration of the instantiation of the domain knowledge base in AROM's language. The protein BsubA01_OPUBA is an instance of the class Protein. Its function is involved in choline uptake (variable identification) and it carries an NBD domain (variable structuralOrganization). It is linked to the instance BSUBA01_OPUBA of the class ABC_Assembly and to the instance BsubA01_OPUBA_N1 of the class Domain through the tuple of the ternary association HasDomain. This association is described by three attributes: beginPosition and endPosition, giving the domain positions on the protein sequence, and domainFamily, corresponding to the functional classification of the domain. Assembly, protein and domain correspond to the three roles whose instantiation creates the link between the three objects BsubA01_OPUBA, BSUBA01_OPUBA and BsubA01_OPUBA_N1, each belonging to one of the three classes connected through the association. peptideLength and testSBP are attributes whose values are computed using an inference descriptor: for example, the length of the peptide is computed from its position on the chromosome, which is retrieved through the association BelongsTo (data not shown, Fig. 4).

 
The relationships between n classes (n ≥ 2) are represented, following the UML terminology, by n-ary associations through roles, each specified by a name and typed by the class involved in the relationship. Named and typed variables can then complete the description of an association. In Fig. 2a, the three classes Domain, Protein and Assembly are linked through the ternary association HasDomain, which is described by three variables: begin, end and domainFamily, which store the first and last positions of the domain on the protein, and its sub-family.

Objects are instances of a class and tuples are instances of associations. A tuple of an n-ary association is the (n+p)-tuple made of a link involving n partners (objects) and described by p variables. Therefore, a link represents an existing relationship between objects, each playing one or more identified roles in the link. Examples of objects and tuples in AROM's language are given in Fig. 2b. They correspond to the instantiation of the three classes and of the association described in Fig. 2a.
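To make the (n+p)-tuple concrete, the following Python sketch mimics the ternary association HasDomain with n = 3 roles and p = 3 variables. It is not AROM syntax: the dataclasses, the type check and the snake_case field names are assumptions made for illustration. The object names come from the caption of Fig. 2, whereas the positions used in the example are placeholders, not values from the paper.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Assembly:
    name: str


@dataclass(frozen=True)
class Protein:
    name: str


@dataclass(frozen=True)
class Domain:
    name: str


@dataclass(frozen=True)
class HasDomain:
    """One tuple of the ternary association: 3 roles + 3 variables."""
    # n = 3 roles, each typed by a class
    assembly: Assembly
    protein: Protein
    domain: Domain
    # p = 3 descriptive variables
    begin: int
    end: int
    domain_family: str

    def __post_init__(self):
        # Each role must be filled by an object of the declared class.
        for role, cls in (("assembly", Assembly),
                          ("protein", Protein),
                          ("domain", Domain)):
            if not isinstance(getattr(self, role), cls):
                raise TypeError(f"role {role!r} expects a {cls.__name__}")
        if not 1 <= self.begin <= self.end:
            raise ValueError("expected 1 <= begin <= end")


# Placeholder positions; only the object names appear in the source.
link = HasDomain(
    assembly=Assembly("BSUBA01_OPUBA"),
    protein=Protein("BsubA01_OPUBA"),
    domain=Domain("BsubA01_OPUBA_N1"),
    begin=1, end=200, domain_family="NBD",
)
```

Keeping the link as an object of its own, rather than as attributes on Protein or Domain, is what lets the association carry variables and be specialized independently of the classes it connects.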

Once typed, a variable can be further characterized using different kinds of descriptors. For example, domain descriptors restrict the domain of values that one can assign to the considered variable. Inference descriptors are either programs, or algebraic equations written using an algebraic modeling language (AML), that allows the expression of equations involving objects and tuples of the KB through a formalism resembling mathematical notation. An inference descriptor is attached to a variable for computing its value for a given instance. Such an inference may involve not only other variables of the class, but also variables of another class associated with the former through an association.
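The two kinds of descriptors can be illustrated with a short Python sketch. It is an analogy, not AROM's AML: a domain descriptor is rendered as a validation function and an inference descriptor as a computed property. The allowed domain types (NBD, MSD, SBP) come from the text; the `BelongsTo` record, its coordinate fields and the convention that the gene coordinates include the stop codon are assumptions.

```python
from dataclasses import dataclass

# Domain descriptor: the set of values the variable `type` may take.
ALLOWED_DOMAIN_TYPES = frozenset({"NBD", "MSD", "SBP"})


def check_domain_type(value):
    """Reject any value outside the declared domain of the variable."""
    if value not in ALLOWED_DOMAIN_TYPES:
        raise ValueError(f"{value!r} is not an allowed domain type")
    return value


@dataclass(frozen=True)
class BelongsTo:
    """Hypothetical tuple linking a protein to its chromosome."""
    chromosome: str
    gene_begin: int   # first nucleotide, 1-based, inclusive
    gene_end: int     # last nucleotide, inclusive, stop codon included


class Protein:
    def __init__(self, name, belongs_to):
        self.name = name
        self._belongs_to = belongs_to

    @property
    def peptide_length(self):
        # Inference descriptor: the value is derived from a variable of
        # an associated tuple instead of being stored by hand.
        span = self._belongs_to.gene_end - self._belongs_to.gene_begin + 1
        return span // 3 - 1   # codons minus the assumed stop codon
```

With these assumptions, a gene spanning positions 1 to 303 covers 101 codons and therefore yields a peptide length of 100.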

Then, classes and associations provide necessary conditions for membership: an instance can belong to a class (an association) if each variable of the class (association) is assigned a correct value according to the model.

With AROM, classes are organized within a tree-like partial order named specialization, whose set-based semantics is close to that of subsumption. A class inherits all variables (including descriptors) of its super-class; class specialization can also involve the addition of new variables, the restriction of the domain of an inherited variable, and the specification of a new inference method to compute the value of an inherited variable. Associations can be specialized in the same way, although the arity of an association never changes. The ISYMOD DKB developed in AROM language is described in the paragraph presenting the DKB (Section 4.1).

AROM provides an internal classification mechanism (Capponi and Gensel, 2000) that, given an object and a tree of classes, finds the most specialized class the object can belong to according to the variable specifications. Classification also runs over tuples of the association hierarchy. In addition, we recently added an algorithm that recursively propagates the classification to all related objects. This algorithm is based on the association properties of AROM, which draw controlled paths among objects (Chabalier et al., 2003). More details on AROM can be found in Page et al. (2001) and Capponi et al. (2001).
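The idea of finding the most specialized class can be sketched as a depth-first descent through a class tree. The code below is a simplified illustration, not AROM's algorithm: membership conditions are plain Python predicates over a dictionary of variable values, and the variables `system` and `has_sbp` are hypothetical. The class names are those of the Assembly hierarchy given in Section 4.1, and the SBP condition follows the statement that import systems are associated with an SBP whereas export systems lack one.

```python
class KClass:
    """A node of the specialization tree (illustrative structure)."""

    def __init__(self, name, accepts, children=()):
        self.name = name
        self.accepts = accepts          # predicate over a dict of values
        self.children = list(children)


def classify(obj, node):
    """Return the name of the most specialized class that accepts obj.

    Returns None when obj does not satisfy the conditions of `node`.
    """
    if not node.accepts(obj):
        return None
    for child in node.children:
        found = classify(obj, child)
        if found is not None:
            return found                # a deeper class fits
    return node.name                    # no sub-class fits: stay here


root = KClass("Assembly", lambda o: True, [
    KClass("ABC_Assembly", lambda o: o.get("system") == "ABC", [
        KClass("ImportABC", lambda o: o.get("has_sbp") is True),
        KClass("ExportABC", lambda o: o.get("has_sbp") is False),
    ]),
    KClass("TCS_Assembly", lambda o: o.get("system") == "TCS"),
])
```

An object whose SBP status is still unknown stays in ABC_Assembly and can be reclassified into ImportABC or ExportABC once the missing value is instantiated, which is the behavior that makes classification useful in an incrementally filled knowledge base.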

3.2 AROMTASKS: the task manager of AROM
The methodological knowledge is the knowledge that produces and exploits the knowledge of the domain under study. The goal of developing an MKB is to structure the methodological knowledge in order to select the most appropriate methods for solving a previously identified and described problem within an evolving context represented by the DKB. It allows the user to select and/or control the executable process.

AROM's task manager (AROMTASKS) provides a language for describing problems, and an execution controller which manages their resolution. In order to construct an MKB, we have to: (1) identify and describe the different problems encountered in a given domain (in our case the identification and reconstruction of biological integrated systems), (2) define the solving method(s) associated with each problem and (3) associate problems and solving methods through a solving strategy.

3.2.1 Describing problems
In AROMTASKS, a class of problems allows a parsable description of a set of similar problems that can be encountered in a specific domain modeled into the connected DKB (Parmentier and Ziébelin, 1999). Therefore, a class of problem (named a problem) is described by a list of inputs and outputs which can be typed by classes and/or associations of the DKB. This list may be completed with a textual description of the problem. Problems are organized through two kinds of relationships: an is-a relationship and a part-of relationship.

  • The is-a relationship corresponds to the specialization of problems. Thus, a sub-problem is a particular case of the more general problem, also called super-problem; it inherits all inputs and outputs of its super-problem and can also be described by new specific inputs. Problem specialization facilitates the description of any new problem that has already been partially described in a more general context. At execution time, when a problem must be solved, a more specific problem may be selected and solved according to the state of the DKB and the availability of the inputs of the problem.
  • More complex problems can be encountered, which can be solved only by combining other ones. The part-of relationship allows a problem, named composite, to be subdivided into several other problems, called components. Therefore, all along a composition hierarchy, a problem can be either a composite or a component depending on the subdivision level. The inputs/outputs of a composite problem are included in the set of the inputs/outputs of all its component problems. When the resolution is launched, additional arguments, named temporary inputs/outputs, can be involved in order to ensure the dataflow between the different component problems.
An elementary problem is a problem that can be neither further subdivided nor specialized. Usually, one or more executable methods are associated with each elementary problem. An example of a problem description in AROMTASKS is given in Fig. 3. The consistency of the problem decompositions along these two kinds of hierarchies is checked by AROMTASKS with regard to its underlying language semantics. Moreover, at execution time, the dataflow is automatically managed by a module named the execution controller, which dynamically selects the most appropriate problem to be solved at any breakpoint, and which integrates its results in a consistent way. If the result of a problem is a modification of the DKB, the consistency checking is passed to AROM's kernel.
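The two problem hierarchies can be mirrored in a small Python class. This is a sketch of the concepts only and does not reproduce the AROMTASKS language: the class, its method names and every problem name except PredByPsiBlast (taken from Fig. 3) are hypothetical, as are the input and output labels.

```python
class Problem:
    """A class of problems with is-a and part-of links (illustrative)."""

    def __init__(self, name, inputs=(), outputs=(),
                 super_problem=None, components=()):
        self.name = name
        self.own_inputs = set(inputs)
        self.own_outputs = set(outputs)
        self.super_problem = super_problem     # is-a link
        self.components = list(components)     # part-of links
        self.sub_problems = []
        if super_problem is not None:
            super_problem.sub_problems.append(self)

    def inputs(self):
        # A sub-problem inherits every input of its super-problem.
        inherited = self.super_problem.inputs() if self.super_problem else set()
        return inherited | self.own_inputs

    def outputs(self):
        inherited = self.super_problem.outputs() if self.super_problem else set()
        return inherited | self.own_outputs

    def is_elementary(self):
        # Neither subdivided (part-of) nor specialized (is-a).
        return not self.components and not self.sub_problems

    def composite_is_consistent(self):
        """Check that a composite's I/O is covered by its components."""
        comp_in = set().union(*(c.inputs() for c in self.components))
        comp_out = set().union(*(c.outputs() for c in self.components))
        return self.inputs() <= comp_in and self.outputs() <= comp_out


predict = Problem("Predict", inputs={"proteins"}, outputs={"predictions"})
psi = Problem("PredByPsiBlast", inputs={"profile"}, super_problem=predict)
check = Problem("CheckPrediction", inputs={"predictions"},
                outputs={"validated"})
identify = Problem("IdentifyPartners", inputs={"proteins", "profile"},
                   outputs={"validated"}, components=[psi, check])
```

In this example `predictions` is produced by one component and consumed by the next without appearing in the composite's own signature, which corresponds to what the text calls a temporary input/output.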



Fig. 3 Textual extract of an MKB. Description of the PredByPsiBlast problem, one of its corresponding solving methods, and a strategy.

 
3.2.2 Programming methods to solve problems
Each problem (e.g. sequence alignment) may be associated with several algorithms (e.g. BlastP, Fasta, SmithWaterman, etc.); each algorithm is then described in AROMTASKS by an executable solving method. Thus, a method describes the way a class of problems can be solved. It also associates a set of inputs with a set of outputs. However, unlike a problem, a method contains an executable part which specifies how the outputs are computed from the inputs. The executable part contains instructions written for a Java language interpreter (BeanShell) (see Fig. 3 for an example of a method definition). In order to reuse the methods across different applications and to avoid overloading the MKB, each method definition relies on an external program library written in Java.

3.2.3 Building and describing a strategy
Finally, a strategy depicts the association of a problem with its solving method(s) (Fig. 3). It is described by the problem name, the set of solving methods and a set of criteria guiding the choice among the different possible methods. A method can be associated with a problem in a strategy if a strict (one-to-one) mapping can be established between the method inputs and outputs and the problem inputs and outputs. At execution time, the right method for solving a problem is selected according to the state and class membership of the actual input provided at that time. The order in which the problems have to be solved depends on the inputs and outputs of the component problems and constitutes the dataflow. The ISYMOD bioinformatics strategy modeled with AROMTASKS is fully explained in the MKB paragraph (Section 4.2).
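A strategy of this kind can be sketched as a list of (criterion, method) pairs guarded by a signature check. The Python below is an illustration under assumptions and not the AROMTASKS execution controller: the one-to-one mapping is approximated by comparing multisets of type names, the selection criterion (sequence length) is invented for the example, and the two `run` functions are stand-ins that do not call the real SmithWaterman or BlastP programs.

```python
from collections import Counter


class Method:
    def __init__(self, name, input_types, output_types, run):
        self.name = name
        self.input_types = list(input_types)
        self.output_types = list(output_types)
        self.run = run                 # executable part of the method


class Strategy:
    """Associates one problem with its candidate solving methods."""

    def __init__(self, problem_inputs, problem_outputs):
        self.problem_inputs = list(problem_inputs)
        self.problem_outputs = list(problem_outputs)
        self.choices = []              # (criterion, method), in priority order

    def add(self, method, criterion):
        # Assumed reading of the strict one-to-one mapping: the same
        # types, the same number of times, on both sides.
        if (Counter(method.input_types) != Counter(self.problem_inputs)
                or Counter(method.output_types) != Counter(self.problem_outputs)):
            raise ValueError(f"{method.name}: signature does not match problem")
        self.choices.append((criterion, method))

    def solve(self, *args):
        # Select the first method whose criterion accepts the actual input.
        for criterion, method in self.choices:
            if criterion(*args):
                return method.name, method.run(*args)
        raise LookupError("no method applies to this input")


align = Strategy(problem_inputs=["Protein"], problem_outputs=["Prediction"])
align.add(Method("SmithWaterman", ["Protein"], ["Prediction"],
                 lambda seq: f"exhaustive:{len(seq)}"),   # stand-in body
          criterion=lambda seq: len(seq) < 50)
align.add(Method("BlastP", ["Protein"], ["Prediction"],
                 lambda seq: f"heuristic:{len(seq)}"),    # stand-in body
          criterion=lambda seq: True)
```

Separating the criteria from the methods keeps each method reusable across strategies, which matches the stated aim of defining methods against an external library rather than inside the MKB.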


    4 ISYMOD SCHEME
Ideally, the DKB and MKB should be presented in parallel since they are deeply interconnected. However, for clarity, we will first present the DKB, which models the biological knowledge, and then the MKB, which models the methods employed to infer this knowledge from the raw data. We will show how the outputs of the methods launched in the MKB are used to feed the classes and associations of the DKB and how the dataflow is handled in the MKB.

4.1 Schema of the domain knowledge base
Currently, ISYMOD models two types of integrated systems, ABC transporters and TCSs, as well as the results of the methods used for their prediction. The different concepts identified for the modeling of this specific biological domain, and the relationships among them, are represented by root classes and root associations. Properties of concepts and relationships are represented by variables of the corresponding class and association, respectively.

The complete schema of the DKB (Fig. 4) includes 12 root classes, some of which are specialized into more characterized sub-classes. The root classes are linked through 13 root associations, which are specialized in turn along with the class hierarchies. The schema can be subdivided into three distinct parts, which are connected through the root class Protein. The first part contains general information relative to the primary source of data such as the chromosome, the strain, the organism and the references, which are retrieved from EMBL/GenBank files (see next section). The second part is dedicated to the storage of the prediction results, in order to keep track of the analyses performed on each protein. For that purpose, a class Prediction, connected to the class Protein, is specialized according to the type of approach used, which is based either on similarity searches (SimilarityPred sub-class) or on motif identification (MotifPred sub-class). The deepest classes in the hierarchy correspond to the bioinformatics methods. Another class, CheckingOfPrediction, holds the result of a method applied to identify the false positives. This knowledge is obtained by solving the partner identification problem modeled in the MKB; it provides the necessary knowledge to instantiate the third part of the DKB, which handles the main concepts and covers the knowledge representation of the integrated systems. This last part is centered on the three major biological entities corresponding to the three root classes Protein, Domain and Assembly, linked by the root association HasDomain, as explained previously (Fig. 2). The functional classification is depicted by the class Subfamily, while the association IsMemberOf connects the integrated system to its current subfamily. Although the top classes can model any system, the class and association hierarchies are system-dependent, in order to express more precisely the specific features of each system. As an example, the class Assembly is specialized into ABC_Assembly and TCS_Assembly, the former being again specialized into ImportABC and ExportABC according to the type of transport. Some classes are specific to one type of integrated system. Namely, the class Stimulus is only related to TCS_Assembly, while Compound only concerns ABC_Assembly.
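The specialization of Assembly and the system-specific classes described above can be pictured with a small Python sketch. This is only an analogy: AROM is not Python, the `Association` helper and the association names `AssemblyCompound` and `AssemblyStimulus` are hypothetical, and the instance names are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


class Assembly:
    """Root class for any integrated system."""
    def __init__(self, name: str):
        self.name = name


class ABC_Assembly(Assembly):
    pass


class ImportABC(ABC_Assembly):
    pass


class ExportABC(ABC_Assembly):
    pass


class TCS_Assembly(Assembly):
    pass


class Compound:
    def __init__(self, name: str):
        self.name = name


class Stimulus:
    def __init__(self, name: str):
        self.name = name


@dataclass
class Association:
    """A named association whose roles are restricted to given classes."""
    name: str
    roles: Tuple[type, ...]
    tuples: List[tuple] = field(default_factory=list)

    def add(self, *objects) -> None:
        # Reject tuples whose objects do not belong to the declared role classes.
        if len(objects) != len(self.roles) or not all(
                isinstance(obj, role) for obj, role in zip(objects, self.roles)):
            raise TypeError(f"invalid tuple for association {self.name}")
        self.tuples.append(objects)


# Hypothetical association names: Compound only concerns ABC_Assembly,
# Stimulus is only related to TCS_Assembly.
assembly_compound = Association("AssemblyCompound", (ABC_Assembly, Compound))
assembly_stimulus = Association("AssemblyStimulus", (TCS_Assembly, Stimulus))

assembly_compound.add(ImportABC("importer_1"), Compound("example_compound"))
assembly_stimulus.add(TCS_Assembly("tcs_1"), Stimulus("example_stimulus"))
print(len(assembly_compound.tuples), len(assembly_stimulus.tuples))
```

Because ImportABC inherits from ABC_Assembly, an importer is accepted wherever an ABC_Assembly is expected, whereas a TCS_Assembly linked to a Compound is rejected.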



Fig. 4 UML-like graphical representation of the DKB. The notations are the same as those used in Fig. 3. The modeling of the DKB can be subdivided into three parts connected by the Protein class. The first part, related to the primary source of data, is depicted at the bottom left of the schema. The second part (bottom right) represents the prediction results according to the method used. The core part of the DKB is represented at the top of the schema. Root classes and associations are represented by shaded boxes.

 
4.2 Schema of the methodological knowledge base
The bioinformatics approach described in Section 2.2 is declaratively modeled as a problem layout using AROMTASKS. A complete description of the modeling would be too lengthy, so only the most critical aspects are developed here. The whole bioinformatics strategy is described in Quentin et al. (2002). The identification and reconstruction of integrated systems can be organized and solved in four major steps: primary data retrieval, partner identification, system reconstruction and learning update. It has been modeled through a composite problem AnalysisOfIntegratedSystem decomposed into four component problems: DataRetrieval, PartnerIdentification, SystemReconstruction and Learning (Fig. 5).
  • Primary data retrieval. The primary sources of data are the EMBL or GenBank files provided by the authors of the genome annotation. When a newly sequenced genome becomes available in the databanks (EMBL or GenBank), it is automatically retrieved and parsed to extract and format the required data. Therefore, two component problems must be solved: FileRetrieval and FileFormatting. The inputs of the FileRetrieval problem are the URL of the genome repository and the list of already processed genomes, so that only the new genome files are retrieved. Each new genome file is the input of the problem FileFormatting, which extracts: (1) the protein sequences annotated in the selected genome and (2) the information contained in the EMBL/GenBank file concerning the organism, strain, chromosome, etc. Its outputs are a protein sequence file in Fasta format, and instances of the classes Organism, Strain, Chms and References as well as instances of the associations linking these classes. The Protein class is also partially instantiated. Indeed, this first step of the solving execution extracts all the proteins encoded by a genome without knowing whether they belong to an integrated system. Therefore, for each instance of the class Protein, only the name, description and orientation variables are valued.
  • Partner identification. The identification of all the partners of a system is the major problem encountered. It is solved by the problem PartnerIdentification, whose input is the protein sequence file created as output of the DataRetrieval problem, and whose outputs are instances of the major root classes and associations of the DKB (Protein, Domain, HasDomain), as well as instances of the classes and associations dedicated to the prediction storage.
This high-level composite problem can be subdivided into two major component problems: the identification of the proteins involved in an integrated system (ProteinIdentification) and the identification of the domains themselves (DomainIdentification), including boundary prediction (DomainPrediction) and feature annotation such as the number and location of transmembrane fragments (FeatureAnnotation).



Fig. 5 General modeling of the MKB. The different problems to be solved are organized through either a part-of relationship (downward arrows) or an is-a relationship (upward arrows). The problems shown in light gray are the elementary problems. They are solved by launching an appropriate executable method, which may encapsulate bioinformatics programs.

 
The problem ProteinIdentification requires the solving of two other component problems: Prediction, which identifies proteins potentially involved in the integrated systems, and PredictionChecking, which confirms or rejects the prediction. The inputs of the problem Prediction are the protein sequence files and the parameters required for applying the corresponding bioinformatics program (e.g. motifs, profiles, etc.). Its outputs are objects of the Prediction class and tuples of the IsPredictedBy association that link each protein to its prediction result. As the identification of the protein partners of a system can be achieved by different bioinformatics programs, based either on similarity searches or on motif detection, the Prediction problem is specialized into a hierarchy of sub-problems according to the properties of the associated algorithms. The appropriate sub-problem is selected according to the dynamic type of the input data. For example, if the input file of the problem Prediction actually contains profiles, then the sub-problem PredByProfile is selected. Proceeding down the tree of problems, if the profiles are PsiBlast profiles, then the sub-problem PredByPsiBlast is selected. Since the latter is an elementary problem, it is solved by launching the UtilPsiBlast method, which runs the program PsiBlastP on the sequences from the file given as input (see Fig. 3 for the details). The outputs of each sub-problem are instances of the corresponding sub-class of the DKB Prediction class and tuples of the association IsPredictedBy. Then, all the previously stored proteins that are not predicted to be involved in an integrated system are deleted from the DKB.
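The selection of a sub-problem from the dynamic type of the input can be sketched as a descent in a tree of problems. The following Python sketch is an illustration under assumptions, not the AROMTASKS mechanism itself: the input classes `ParameterFile`, `ProfileFile`, `PsiBlastProfileFile` and `HmmerProfileFile`, the `ProblemNode` class and the method name `UtilHmmer` are hypothetical, while the problem names and UtilPsiBlast come from the text.

```python
class ParameterFile:            # hypothetical input types, for illustration only
    pass


class ProfileFile(ParameterFile):
    pass


class PsiBlastProfileFile(ProfileFile):
    pass


class HmmerProfileFile(ProfileFile):
    pass


class ProblemNode:
    """A problem with the input type it accepts and its specialized sub-problems."""
    def __init__(self, name, input_type, method=None, children=()):
        self.name = name
        self.input_type = input_type
        self.method = method          # only elementary problems have a method
        self.children = list(children)


def select_subproblem(root, actual_input):
    """Move down the specialization tree as long as a sub-problem accepts the input."""
    if not isinstance(actual_input, root.input_type):
        raise TypeError(f"{root.name} cannot handle {type(actual_input).__name__}")
    current = root
    while True:
        for child in current.children:
            if isinstance(actual_input, child.input_type):
                current = child
                break
        else:
            return current  # no more specialized sub-problem applies


prediction = ProblemNode("Prediction", ParameterFile, children=[
    ProblemNode("PredByProfile", ProfileFile, children=[
        ProblemNode("PredByPsiBlast", PsiBlastProfileFile, method="UtilPsiBlast"),
        ProblemNode("PredByHmmer", HmmerProfileFile, method="UtilHmmer"),  # assumed name
    ]),
])

selected = select_subproblem(prediction, PsiBlastProfileFile())
print(selected.name, selected.method)
```

With a PsiBlast profile file as input, the descent stops at PredByPsiBlast; with a generic profile file it stops at PredByProfile, which is not elementary and would need a further choice.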

The next step consists of validating the prediction by solving the problem PredictionChecking. It takes as input the identified proteins, which are retrieved by querying the DKB, and provides, as outputs, instances of the CheckingOfPrediction class. In addition, the variable computedStatus of the class Protein receives the value ‘confirmed’ if the prediction is validated, or ‘rejected’ if the protein appears to be a false positive. At present, only one method has been implemented, based on the BackBlast procedure explained in Section 2.2.

In order to complete the solving of the PartnerIdentification problem, the protein domains must be annotated. The DomainIdentification problem allows prediction of: (1) the boundaries of the domain on the protein sequence, (2) the functional type of the domain (e.g. MSD, NBD or SBP in the case of ABC transporters, see Fig. 1) and (3) its sub-family membership. These data are obtained by analyzing the results of the PsiBlast prediction. Therefore, the input of the sub-problem DomainIdentification is the set of instances of the sub-class PsiBlastPred that are linked to protein instances whose computedStatus variable has the value ‘confirmed’. Those instances are obtained by querying the DKB. The outputs are instances of the class Domain and of the association HasDomain.

Once the problem PartnerIdentification is solved, the sub-classes of Prediction are instantiated as well as the root classes Protein, Domain and CheckingOfPrediction. As the integrated systems have not yet been reconstructed, the tuples of the HasDomain association lack the link with the Assembly class. They will be completed during the solving of the SystemReconstruction problem.

  • System reconstruction. The assembly of the validated partners into a functional biological integrated system is performed by solving the SystemReconstruction problem which, in the present version, is specialized into the single SystemAssemblying sub-problem. The reconstruction relies upon two rules: (1) the close localization on the chromosome of the genes encoding the different partners and (2) the compatibility of the domain families of the partners. Therefore, the inputs of this problem are the values of two variables of the BelongTo association (genomicLocalization_begin and genomicLocalization_end) and the value of the DomainFamily variable of the HasDomain association. Solving this problem leads to the instantiation of the Assembly and Subfamily classes, and of the IsMemberOf association.
    Solving the SystemReconstruction problem is the last instantiation step of the DKB. However, a learning step is required in order to update the parameters of the bioinformatics programs according to the newly acquired knowledge. It is modeled in the MKB by the Learning problem.
  • Learning. In order to build the appropriate parameter files for the associated programs, this problem follows the same specialization hierarchy as the problem Prediction. Solving this problem results in the update of the parameters of the prediction methods using the data retrieved from the DKB. For the analysis of a new genome, these updated files will be the input of the problem Prediction.
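The two rules of the system reconstruction step (chromosomal proximity and domain family compatibility) can be illustrated by a deliberately simplified grouping procedure. The sketch below is an assumption-laden illustration, not the method implemented in ISYMOD: the `Partner` record, the `max_gap` threshold of 300 bases and the reduction of "compatibility" to equality of family labels are all hypothetical, and the coordinates are made-up example values.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Partner:
    protein: str
    begin: int            # stands for genomicLocalization_begin
    end: int              # stands for genomicLocalization_end
    domain_family: str    # stands for the DomainFamily variable


def reconstruct(partners: List[Partner], max_gap: int = 300) -> List[List[Partner]]:
    """Group neighbouring partners that share a domain family.

    max_gap is a hypothetical distance threshold; compatibility is
    simplified here to identical family labels.
    """
    assemblies: List[List[Partner]] = []
    for partner in sorted(partners, key=lambda p: p.begin):
        if assemblies:
            previous = assemblies[-1][-1]
            close = partner.begin - previous.end <= max_gap
            compatible = partner.domain_family == previous.domain_family
            if close and compatible:
                assemblies[-1].append(partner)
                continue
        assemblies.append([partner])
    return assemblies


partners = [
    Partner("A", 100, 900, "family_1"),
    Partner("B", 950, 1800, "family_1"),
    Partner("C", 5000, 5900, "family_1"),
    Partner("D", 5950, 6500, "family_2"),
]
groups = [[p.protein for p in group] for group in reconstruct(partners)]
print(groups)  # [['A', 'B'], ['C'], ['D']]
```

Here A and B are adjacent and share a family, so they form one candidate assembly; C is too far from B, and D is close to C but belongs to another family.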


    5 OPERATIONAL ISYMOD
 
ISYMOD models the domain of some integrated systems in such a way that more integrated systems can be added. The automatic programs to identify and reconstruct such systems are also modeled and described: adding a new problem or a new method is a plug-and-play operation. However, ISYMOD is not only a knowledge base: the modeled strategies can be executed, evaluated and their results stored, while AROM checks the consistency of the overall base. This is the reason why we refer to ISYMOD as a knowledge warehouse.

5.1 Processing a strategy
At execution time, the execution controller dynamically builds processing sequences according to the description and the modeling of problems in the MKB. Based on the matching of the inputs and outputs of each elementary problem, several sequences can be constructed (Fig. 6). The simplest sequences are linear, but more complex sequences, named branched sequences, can be constructed by inserting one or more breakpoints. A breakpoint involves the parallel solving of two or more problems in the same execution sequence. It concerns the elementary problems located at the same level of specialization in the MKB and it is defined through the description of the solving strategies related to problems at a higher specialization level. For example, in order to solve the PredByProfile problem, the solving strategy specifies three ways: (1) solving the PredByPsiBlast problem, (2) solving the PredByHmmer problem and (3) solving these two problems in parallel. The user must then provide the inputs that are appropriate to the expected execution sequence. The first execution sequence in Fig. 6 represents the linear processing of a complete analysis of integrated systems in a new proteome, which involves the prediction with the Blast program by solving the PredByBlast problem. The linear execution is often faster, but in order to obtain the largest repertoire of integrated systems in a new proteome, it is better to choose a branched sequence, like the second one in Fig. 6, which applies two kinds of prediction methods to one formatted proteome.
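The construction of a processing sequence from the matching of inputs and outputs can be approximated by a topological ordering of the problems. The sketch below uses the standard `graphlib` module (Python 3.9 or later) and is not the ISYMOD execution controller: the data names (`genome_file`, `fasta_file`, and so on) and the problem name `PredByMast` are hypothetical, and PredictionChecking is shown as directly consuming prediction outputs, whereas in ISYMOD it queries the DKB.

```python
from graphlib import TopologicalSorter

# problem name -> (set of inputs, set of outputs); data names are illustrative.
PROBLEMS = {
    "FileRetrieval": ({"repository_url"}, {"genome_file"}),
    "FileFormatting": ({"genome_file"}, {"fasta_file"}),
    "PredByPsiBlast": ({"fasta_file", "psiblast_profiles"}, {"psiblast_pred"}),
    "PredByMast": ({"fasta_file", "mast_motifs"}, {"mast_pred"}),
    "PredictionChecking": ({"psiblast_pred", "mast_pred"}, {"checked_proteins"}),
}


def execution_levels(problems):
    """Return successive groups of problems; a group with several
    problems corresponds to a parallel step (a breakpoint)."""
    producers = {out: name for name, (_, outs) in problems.items() for out in outs}
    # A problem depends on every problem that produces one of its inputs;
    # inputs produced by no problem must be supplied by the user.
    graph = {
        name: {producers[item] for item in ins if item in producers}
        for name, (ins, _) in problems.items()
    }
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    levels = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        levels.append(ready)
        sorter.done(*ready)
    return levels


levels = execution_levels(PROBLEMS)
for step, names in enumerate(levels, start=1):
    print(step, names)
```

Removing one of the two prediction problems from `PROBLEMS`, together with its output in the inputs of PredictionChecking, yields a purely linear sequence.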



Fig. 6 Example of two processing sequences. These sequences share the same goal: to identify and assemble integrated systems in a new proteome. The first one relies on a prediction by the Blast program. The last problem belongs to the learning step, which is coupled with the prediction program used in this sequence. The second sequence, which is more complex, involves three breakpoints represented by vertical dashed lines. The processing involves a prediction by PsiBlast completed by a Mast prediction.

 
5.2 Consistency control
Once the AnalysisOfIntegratedSystem problem is solved, all the root classes and associations of the DKB have been instantiated, except the classes Compound and Stimulus and the corresponding associations. These two classes have been included in the DKB in order to complete the modeling; however, their instances are still read from a text file, outside the general bioinformatics strategy. Except for the class Prediction, whose sub-classes are directly instantiated according to the solving method, we decided to instantiate only the root classes, in order to maintain the consistency of the whole DKB by taking advantage of the classification mechanism included in AROM. The classification algorithm is launched on all objects of the Subfamily class and moves each object down into the most specialized sub-class (TCS_Subfamily, ABC_Subfamily, ImportABC_Subfamily, ExportABC_Subfamily) by checking the attachment conditions, which in this case are based on the values taken by the variables name and type.

In coordination, all objects and tuples of the DKB are classified into specialized sub-classes and sub-associations by applying the propagation algorithm (Chabalier et al., 2003). This algorithm recursively propagates the result of a single object classification to its linked objects, using AROM's association properties, which draw controlled paths among objects.
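The classification of Subfamily objects into their most specialized sub-class can be sketched as follows. This is a minimal illustration, not AROM's classification algorithm: the `ClassNode` helper, the values "ABC" and "TCS" for the variable type, and the placeholder subfamily names in `IMPORT_NAMES` and `EXPORT_NAMES` are assumptions, since the actual attachment conditions are not given here. The propagation to linked objects is not modeled.

```python
class ClassNode:
    """A class of the hierarchy with its attachment condition."""
    def __init__(self, name, condition, children=()):
        self.name = name
        self.condition = condition    # predicate on an object's variables
        self.children = list(children)


# Placeholder values: the real attachment conditions are not reproduced here.
IMPORT_NAMES = {"import_example"}
EXPORT_NAMES = {"export_example"}

subfamily = ClassNode("Subfamily", lambda obj: True, children=[
    ClassNode("TCS_Subfamily", lambda obj: obj["type"] == "TCS"),
    ClassNode("ABC_Subfamily", lambda obj: obj["type"] == "ABC", children=[
        ClassNode("ImportABC_Subfamily", lambda obj: obj["name"] in IMPORT_NAMES),
        ClassNode("ExportABC_Subfamily", lambda obj: obj["name"] in EXPORT_NAMES),
    ]),
])


def classify(obj, root):
    """Move obj down the hierarchy while a sub-class condition is satisfied."""
    current = root
    moved = True
    while moved:
        moved = False
        for child in current.children:
            if child.condition(obj):
                current = child
                moved = True
                break
    return current.name


print(classify({"name": "import_example", "type": "ABC"}, subfamily))
print(classify({"name": "unknown", "type": "ABC"}, subfamily))
print(classify({"name": "tcs_example", "type": "TCS"}, subfamily))
```

An object whose name matches no specialized condition stays in ABC_Subfamily, which mirrors the idea that only root classes are instantiated and objects are moved down only as far as their variables allow.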


    6 DISCUSSION
 
ISYMOD is a knowledge warehouse since it integrates, in the same local environment, data and the methods used to produce these data, in such a way that it can safely update not only the raw data retrieved from public databases but also the parameters of the methods, thereby improving the analysis methods themselves. ISYMOD is built over AROM, which ensures the consistency of the data. The computational strategy implemented in ISYMOD has been used to establish the repertoire of ABC transporters in more than 100 bacterial genomes. The data produced have been made publicly available via a new release of the specialized database ABCdb (http://www.lcb.cnrs-mrs.fr/~quentin).

The tight connection between the DKB and the MKB is the keystone of our approach, and it makes ISYMOD more than a simple knowledge base. The DKB is not only a way to store the data; it also structures them as pieces of knowledge within a model using an appropriate formal language. The consistency of the expressed biological model is thus automatically ensured by the knowledge representation system. The key point is that these pieces of meaningful data, which we call knowledge, are the inputs and outputs of the problems of the MKB. Therefore, any output computed from a given input when solving a problem is meaningful with regard to the biological model. Consequently, the interpretation of the whole process represented by the task layout is ensured step by step by the model, as are both the final result and the consistency of the overall prediction strategy. We think that such an approach, which merges domain knowledge and methodological knowledge thanks to an appropriate KRS, could be used to build knowledge warehouses for other specific biological fields, bringing together the advantages of local data warehouses and the consistency checking capabilities of KRSs.

Only a few approaches incorporating methods and their generated data have been published. Among these, RiboWeb (Altman et al., 1999) shares the same underlying philosophy as ISYMOD, but has been applied to a different specialized biological field and runs over a different representation system. It is a knowledge base that handles data on the ribosomal subunit structures, and methods for comparing and analyzing these data in order to produce new models.

More closely related to ISYMOD in terms of the underlying structure is GenoStar (Durand et al., 2003), which was developed at the same time with the same OBKRS AROM and task manager AROMTASKS. However, the two projects have quite different aims. On the one hand, GenoStar is a commercial, user-friendly software package that provides a general, fixed instantiable ontology and addresses well-known computational problems concerning mostly the first level of genome annotation, i.e. gene identification and functional predictions. Hence, GenoStar is well adapted both to the solving of classical comparison problems and to the analysis of prediction results thanks to a graphical interface. However, it is not suited for carrying out new in silico explorations of specific biological concepts. On the other hand, ISYMOD is a platform intended to safely set up and check new methods and strategies for the identification and the reconstruction of functional supra-molecular complexes, which is not a classical problem. ISYMOD has been designed so that the domain and methodological schemas can evolve in order to progressively incorporate higher levels of complexity in the modeling of the biological field, as well as new prediction methods. Indeed, extending the DKB schema to include other systems should be easy enough thanks to the explicit separation of associations and classes, the specialization relation and the recursive classification, which facilitates object relocation. We can restructure parts of the schema without disturbing the whole. As an example of such an extension, we are currently working on the conservation of the genomic context of the ABC transporters in order to detect new partners in the biological process, such as enzymes involved in the metabolism of the transported substrate. In the same way, the structure of the MKB allows us to add or remove a task, which in our case encapsulates a bioinformatics method, without disturbing the strategy. This modular architecture offers the opportunity to readily test and evaluate the different routes available to solve a problem.

In the future we would like to convert ISYMOD into an open resource on the Web, so that other users could participate in its enrichment by integrating their own modules at both domain and methodological levels according to their requirements. ISYMOD is built over AROM which features translators to XML and to some description logics. This should help us to translate ISYMOD into OWL, a standard language devoted to the development of web ontologies (http://www.w3.org/2001/sw/WebOnt).

A more fundamental evolution of our modeling will address the integration of other types of relationships, such as functional and phylogenetic links, as well as a dynamic view that shows how the processes are ordered over time. Modeling these different types of relations requires an extension of AROM's kernel by adding the possibility of describing algebraic properties of the associations and the way they can be composed. Such an enrichment of association specification, combined with the implementation of a time manager, should allow the exploitation of data through the launching of simulations over the relationship network. All these improvements will lead to a more precise modeling of biological processes.


    Acknowledgments
 
We gratefully acknowledge Danielle Ziébelin and François Denis for helpful discussions and Adam Manvell for proofreading of the manuscript. This work was supported by grants from the CNRS (Centre National de la Recherche Scientifique) and ACI ‘Informatique, Mathématiques, Physique en Biologie’ (grant 035360). JC was supported by an MRT fellowship.

Received on April 14, 2004; revised on August 25, 2004; accepted on October 22, 2004

    REFERENCES
 

    Altman, R., Bada, M., Chai, X., Whirl Carillo, M., Chen, R., Abernethy, N.F. (1999) RiboWeb: an ontology-based system for collaborative molecular biology. IEEE Intell. Syst., 14, 68–76.

    Beier, D. and Frank, R. (2000) Molecular characterization of two-component systems of Helicobacter pylori. J. Bacteriol., 182, 2068–2076.

    Braibant, M., Gilot, P., Content, J. (2000) The ATP binding cassette (ABC) transport systems of Mycobacterium tuberculosis. FEMS Microbiol. Rev., 24, 449–467.

    Bronner, G., Spataro, B., Page, M., Gautier, C., Rechenmann, F. (2002) Modeling comparative mapping using objects and associations. Comput. Chem., 26, 413–420.

    Capponi, C. and Gensel, J. (2000) Classifications among classes and associations: the AROM's approach. ECOOP2000. Proceedings of the Workshop Objects and Classification: A Natural Convergence, Cannes, France.

    Capponi, C., Chabalier, J., Quentin, Y., Fichant, G. (2001) A knowledge base for integrated biological systems. IEEE Intell. Syst., 16, 52–60.

    Chabalier, J., Fichant, G., Capponi, C. (2003) La classification récursive dans AROM. Application à l'identification de systèmes biologiques. Revues des Sciences et Technologies de l'Information (RSTI) série l'Objet, 9, 167–181.

    Dassa, E. and Bouiges, P. (2001) The ABC of ABCS: a phylogenetic and functional classification of ABC systems in living organisms. Res. Microbiol., 152, 211–229.

    Durand, P., Médigue, C., Morgat, A., Vandenbrouck, Y., Viari, A., Rechenmann, F. (2003) Integration of data and methods for genome analysis. Curr. Opin. Drug Discov. Devel., 6, 346–352.

    Fabret, C., Feher, V.A., Hoch, J.A. (1999) Two-component signal transduction in B. subtilis: how one organism sees its world. J. Bacteriol., 181, 1975–1983.

    Gene Ontology Consortium. (2004) The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res., 1, D258–D261.

    Higgins, C.F. (2001) ABC transporters: physiology, structure and mechanism—an overview. Res. Microbiol., 152, 205–210.

    Holland, B. and Blight, M.A. (1999) ABC-ATPases, adaptable energy generators fuelling transmembrane movement of a variety of molecules in organisms from bacteria to humans. J. Mol. Biol., 293, 381–399.

    Joseph, P., Fichant, G., Quentin, Y., Denizot, F. (2002) Regulatory relationship of two-component and ABC transport systems and clustering of their genes in the Bacillus/Clostridium group, suggest a functional link between them. J. Mol. Microbiol. Biotechnol., 4, 503–513.

    Karp, P.D., Arnaud, M., Collado-Vides, J., Ingraham, J., Paulsen, I., Saier, M. (2004) The E. coli EcoCyc Database: no longer just a metabolic pathway database. ASM News, 70, 25–30.

    Krieger, C.J., Zhang, P., Mueller, L.A., Wang, A., Paley, S., Arnaud, M., Pick, J., Rhee, S.Y., Karp, P.D. (2004) MetaCyc: a multiorganism database of metabolic pathways and enzymes. Nucleic Acids Res., 1, D438–D442.

    Linton, K.J. and Higgins, C.F. (1998) The Escherichia coli ATP-binding cassette (ABC) proteins. Mol. Microbiol., 28, 5–13.

    Mizuno, T. (1997) Compilation of all genes encoding two-component phosphotransfer signal transducers in the genome of Escherichia coli. DNA Res., 28, 161–168.

    Nemati, H.R., Steiger, D.M., Iyer, L.S., Herschel, R.T. (2002) Knowledge warehouse: an architectural integration of knowledge management, decision support, artificial intelligence and data warehousing. Decision Support Systems, 33, 143–161.

    Page, M., Gensel, J., Capponi, C., Bruley, C., Genoud, P., Ziebelin, D., Bardou, D., Dupierris, V. (2001) A new approach to object-based knowledge representation: the AROM System. Proceedings of the 14th International Conference on Industrial and Engineering Applications of Artificial Intelligence and Expert Systems (IEAAI&ES 2001) Lecture Notes in Artificial Intelligence.

    Parkinson, J. and Kofoid, E. (1992) Communication modules in bacterial signaling proteins. Annu. Rev. Genet., 26, 71–112.

    Parmentier, T. and Ziébelin, D. (1999) Distributed problem solving environment dedicated to DNA sequence annotation. Proceedings of the 11th European Workshop (EKAW'99), Lecture Notes in Computer Science, Vol. 1621, Springer-Verlag, Heidelberg, p. 243.

    Paulsen, I.T., Sliwinski, M.K., Saier, M.H., Jr. (1998) Microbial genome analyses: global comparisons of transport capabilities based on phylogenies, bioenergetics and substrate specificities. J. Mol. Biol., 277, 573–592.

    Perrière, G., Duret, L., Gouy, M. (2000) HOBACGENE: database system for comparative genomics in bacteria. Genome Res., 10, 379–385.

    Quentin, Y., Fichant, G., Denizot, F. (1999) Inventory, assembly and analysis of Bacillus subtilis ABC transport systems. J. Mol. Biol., 287, 467–484.

    Quentin, Y. and Fichant, G. (2000) ABCdb: an ABC transporter database. J. Mol. Microbiol. Biotechnol., 2, 501–504.

    Quentin, Y., Chabalier, J., Fichant, G. (2002) Strategies for the identification, the assembly and the classification of integrated biological systems in completely sequenced genomes. Comput. Chem., 26, 447–457.

    Rodrigue, A., Quentin, Y., Lazdunski, A., Mejean, V., Foglino, M. (2000) Two-component systems in P. aeruginosa: why so many? Trends Microbiol., 8, 498–504.

    Saurin, W., Hofnung, M., Dassa, E. (1999) Getting in or out: early segregation between importers and exporters in the evolution of ATP-binding cassette (ABC) transporters. J. Mol. Evol., 48, 22–41.

    Stevens, R., Robinson, A., Goble, C.A. (2003) myGrid: personalised bioinformatics on the information grid, proceedings of the 11th IBSM (Brisbane). Bioinformatics, 19, i302–i304.

    Taglicht, D. and Michaelis, S. (1998) A complete catalogue of Saccharomyces cerevisiae ABC proteins and their relevance to human health and disease. Methods Enzymol., 292, 130–162.

    Throup, J.P., Koretke, K.K., Bryant, A.P., Ingraham, K.A., Chalker, A.F., Ge, Y., Marra, A., Wallis, N.G., Brown, J.R., Holmes, D.J., Rosenberg, M., Burnham, M.K. (2000) A genomic analysis of two-component signal transduction in Streptococcus pneumoniae. Mol. Microbiol., 35, 566–576.

    Tomii, K. and Kanehisa, M. (1998) A comparative analysis of ABC transporters in complete microbial genomes. Genome Res., 8, 1048–1059.

